DESCRIPTION
LOW COMPUTATIONAL-COMPLEXITY DIGITAL FILTER BANK
Technical Field
The invention relates in general to digital encoding and decoding of information. More particularly, the invention relates to efficient implementation of digital analysis and synthesis filter banks used in digital encoding and decoding. In a preferred embodiment of the invention, the length of the filter bank used to implement critically-sampled analysis and synthesis filter banks may be adaptively selected.
Throughout the following discussion and especially in the background discussion, more particular mention will be made of audio applications; however, it should be understood that the present invention is applicable to a range of applications wider than just that of audio encoding and decoding.
Background Art
Introduction
There is considerable interest among those in the field of signal processing to develop efficient means to transmit or store information. Improving coding efficiency includes (1) reducing informational requirements, that is, reducing the amount of information required to adequately represent a signal during transmission or storage, and (2) reducing processing requirements, that is, reducing the amount of processing required to implement the encoding and decoding processes. In high-quality audio coding applications, informational requirements can sometimes be reduced without loss of perceptible audio quality by exploiting various psychoacoustic effects. Signal recording, transmitting, or reproducing techniques which divide the useful signal bandwidth into narrow bands with bandwidths approximating the human ear's critical bands can exploit psychoacoustic masking effects. Such techniques divide the signal bandwidth with an analysis filter bank, process the signal passed by each filter band, and reconstruct a replica of the original signal with a synthesis filter bank.
Two common coding techniques are subband coding and transform coding. Subband coders and transform coders can reduce the informational requirements in particular frequency bands where the noise caused by the resulting coding inaccuracy is psychoacoustically masked. Subband coders may be implemented by a bank of digital bandpass filters defining subbands of varying bandwidth.
Transform coders may be implemented by any of several time-domain to frequency-domain transforms. One or more adjacent transform coefficients are grouped together to define "subbands" having effective bandwidths which are sums of individual transform coefficient bandwidths.
The mathematical basis for digital subband filter banks and digital block transforms is essentially the same. See Tribolet and Crochiere, "Frequency Domain Coding of Speech," IEEE Trans. Acoust., Speech, and Signal Proc., ASSP-27, October, 1979, pp. 512-30. Therefore, throughout the following discussion the concepts associated with terms such as "subband coder" and "transform coder" generally apply to both a true subband coder and a transform coder. The term "subband" refers to portions of the useful signal bandwidth whether implemented by a true subband coder or a transform coder. The terms "transform" and "transforming" include digital filters and digital filtering, respectively.
In most digital coding applications, processing requirements can be reduced by increasing the efficiency of subband filtering. Improved processing efficiency permits implementation of encoders and decoders which are less expensive to build, or which impose lower signal propagation delays through an encoder/decoder system.
In many transform coder systems, the analysis and synthesis filter banks are implemented by discrete time-domain to frequency-domain transforms such as the Discrete Fourier Transform (DFT), the Discrete Cosine Transform (DCT), and the Discrete Sine Transform (DST). The number of time-domain signal samples, referred to herein as the time-domain signal sample block length, processed by such transforms is sometimes called the transform length, and the amount of processing required to perform these transforms is generally proportional to the square of the time-domain signal sample block length.
The number of frequency-domain transform coefficients generated by a transform is also sometimes called the transform length. It is common for the number of frequency-domain transform coefficients generated by the transform to be equal to the time-domain signal sample block length, but this equality is not necessary. For example, one transform referred to herein as the E-TDAC transform is sometimes described in the art as a transform of length N that transforms signal sample blocks with a length of 2N samples. It is possible, however, to also describe the transform as one of length N which generates only ½N unique frequency-domain transform coefficients. Thus, in this discussion the time-domain signal sample block length and the discrete transform length are generally assumed to be synonyms.
Various techniques have been utilized to reduce the amount of time required to perform a transform, or to reduce the processing power required to perform a transform in a given amount of time, or both. One technique is taught in Narasimha and Peterson, "On the Computation of the Discrete Cosine Transform," IEEE Trans. on Communications, COM-26, June, 1978, pp. 934-36. Briefly, this technique evaluates an N-point DCT by rearranging or "shuffling" the samples representing the input signal, performing an N-point DFT on the shuffled samples, and multiplying the result with a complex function. It is approximately twice as efficient as other techniques using a 2N-point FFT; however, Narasimha and Peterson only teach how to improve the efficiency of filter banks implemented by one particular DCT.
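The shuffle, transform, and multiply steps described above can be illustrated with a minimal sketch. It assumes an unnormalized DCT of the form X(k) = Σ x(n) cos[π(2n+1)k / 2N] and an even block length; the function names `dft`, `dct_direct`, and `dct_via_shuffle` are hypothetical, and a naive DFT stands in for the FFT that a practical implementation would use.

```python
import cmath
import math


def dft(v):
    """Naive N-point DFT; a practical system would use an FFT here."""
    n_pts = len(v)
    return [
        sum(v[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def dct_direct(x):
    """Reference DCT (assumed form): X(k) = sum_n x(n) cos(pi (2n+1) k / (2N))."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_pts))
            for n in range(n_pts))
        for k in range(n_pts)
    ]


def dct_via_shuffle(x):
    """Shuffle the samples, take one N-point DFT, multiply by a complex factor."""
    n_pts = len(x)  # assumed even
    shuffled = [0.0] * n_pts
    for n in range(n_pts // 2):
        shuffled[n] = x[2 * n]                  # even-indexed samples, in order
        shuffled[n_pts - 1 - n] = x[2 * n + 1]  # odd-indexed samples, reversed
    spectrum = dft(shuffled)
    return [
        (cmath.exp(-1j * math.pi * k / (2 * n_pts)) * spectrum[k]).real
        for k in range(n_pts)
    ]


samples = [1.0, -2.0, 3.5, 0.25, -1.0, 4.0, 2.0, -0.5]
max_error = max(
    abs(a - b) for a, b in zip(dct_via_shuffle(samples), dct_direct(samples))
)
print(max_error)  # expected to be at rounding-error level
```

The saving comes from replacing a 2N-point transform with an N-point transform; the sketch only checks that the two evaluations agree numerically.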
Another technique which yields approximately a two-fold increase in processing efficiency concurrently performs two real-valued discrete transforms of length N with a single complex-valued FFT of length N. A transform coder utilizing this technique to concurrently perform a modified DCT with a modified DST is described in International Patent Application PCT/US 91/02512, Publication No. WO 91/16769 (published October 31, 1991). The significance of these particular modified DCT and modified DST is discussed in Princen and Bradley, "Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation," IEEE Trans. on Acoust., Speech, Signal Proc., ASSP-34, 1986, pp. 1153-1161. The authors describe a specific application of these transforms as the time-domain equivalent of an evenly-stacked critically-sampled single-sideband analysis-synthesis system. They are referred to collectively herein as the Evenly-stacked Time-Domain Aliasing Cancellation (E-TDAC) transform.
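The general idea of obtaining two real-valued transforms from one complex-valued transform can be sketched as follows. This is an illustration of the well-known packing trick for two real DFTs, not the specific modified DCT/DST procedure of the cited application; the names `dft` and `two_real_dfts` are hypothetical, and a naive DFT again stands in for an FFT.

```python
import cmath
import math


def dft(v):
    """Naive N-point DFT; a practical system would use an FFT here."""
    n_pts = len(v)
    return [
        sum(v[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def two_real_dfts(a, b):
    """Return the DFTs of real sequences a and b using a single complex DFT."""
    n_pts = len(a)
    # Pack one sequence into the real part and the other into the imaginary part.
    packed = dft([complex(a[n], b[n]) for n in range(n_pts)])
    spec_a, spec_b = [], []
    for k in range(n_pts):
        mirrored = packed[(n_pts - k) % n_pts].conjugate()
        spec_a.append((packed[k] + mirrored) / 2)    # conjugate-symmetric part
        spec_b.append((packed[k] - mirrored) / 2j)   # conjugate-antisymmetric part
    return spec_a, spec_b


first = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 0.0, 1.5]
second = [0.5, -1.5, 2.0, 2.5, -3.0, 1.0, 4.0, -0.5]
spec_first, spec_second = two_real_dfts(first, second)
```

The separation works because the DFT of a real sequence is conjugate-symmetric, so the two spectra occupy distinguishable parts of the packed result.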
Another technique to reduce processing requirements is taught by Malvar, "Lapped Transforms for Efficient Transform/Subband Coding," IEEE Trans. Acoust., Speech, Signal Proc., ASSP-38, June, 1990, pp. 969-78. This technique implements an N-point modified DCT by performing a ½N-point DST after combining pairs of the samples representing the input signal, or "folding" the N input signal samples into a smaller set of ½N points. It is approximately twice as efficient as performing the modified DCT in a straightforward manner; however, Malvar only teaches how to fold input samples for a filter bank implemented by one specific modified DCT whose input samples have been weighted by a specific sine-tapered analysis window.
The specific modified DCT implemented by Malvar is discussed in greater detail by Princen, Johnson, and Bradley, "Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation," ICASSP 1987 Conf. Proc. May 1987, pp. 2161-64. The authors describe this transform as the time-domain equivalent of an oddly-stacked critically sampled single-sideband analysis-synthesis system. It is referred to herein as the Oddly-stacked Time-Domain Aliasing Cancellation (O-TDAC) transform.
It is desirable to implement encoders and decoders with the ability to use different time-domain signal sample block lengths in order to optimize coder performance. It is well known in the art that longer time-domain signal sample block lengths improve the selectivity or frequency-resolving power of transform coders, and better filter selectivity generally improves the ability of a transform coder to exploit psychoacoustic masking effects.
But longer time-domain signal sample block lengths degrade the time-resolution of a subband filter bank. Inadequate time-resolution can produce audible distortion artifacts when quantizing errors of signal events, such as transients, produce pre-transient and post-transient ringing which exceeds the ear's temporal psychoacoustic masking interval. Hence, it is desirable that techniques which improve subband filter bank processing efficiency should also permit adaptive selection of the time-domain signal sample block length.
The importance of time-domain signal sample block length and its effect upon filter bank frequency-domain resolution and time-domain resolution is discussed in more detail in International Patent Application PCT/US 91/02512, Publication No. WO 91/16769, cited above, which is hereby incorporated by reference in its entirety.
Disclosure of Invention
It is an object of the present invention to provide for a subband/transform encoder and a subband/transform decoder of digital information by means of analysis filtering and synthesis filtering requiring lower processing requirements, or imposing lower processing delays, or both.
It is another object of the present invention to provide for a subband/transform encoder and a subband/transform decoder of digital information requiring lower processing requirements, or imposing lower processing delays, or both, by means of analysis filtering and synthesis filtering permitting adaptive selection of the filter-bank length.
Further details of the above objects and still other objects of the invention are set forth throughout this document, particularly in the Modes for Carrying Out the Invention, below. Although the invention is more particularly described for audio encoding and decoding applications, it should be appreciated that the invention is much broader and may be applied to other applications. Throughout this Description, discussion of encoders incorporating the present invention also pertains more generally to signal-analysis filtering applications, and discussion of decoders incorporating the present invention pertains more generally to signal-synthesis filtering applications.
In accordance with the teachings of the present invention in one embodiment, an encoder provides for the encoding of input signal samples representing a time-domain signal. The input samples which are weighted by an analysis-window function are buffered into time-domain signal sample blocks. Pairs of signal samples in the time-domain signal sample blocks are combined by a forward pre-transform function to generate modified samples. Frequency-domain transform coefficients are generated by applying a discrete digital transform to the modified samples. Spectral information is generated by applying a forward post-transform function to the frequency-domain transform coefficients.
Also in accordance with the teachings of the present invention in one embodiment, a decoder provides for the decoding of digitally encoded spectral information. Frequency-domain transform coefficients are generated by applying an inverse pre-transform function to the spectral information. Time-domain transform coefficients are generated by applying an inverse discrete digital transform to the frequency-domain transform coefficients. Time-domain signal sample blocks are generated by applying an inverse post-transform function to the time-domain transform coefficients, and output samples which correspond to the input samples to a companion encoder are generated by overlapping and adding samples in adjacent time-domain signal sample blocks.
The various features of the present invention and its preferred embodiments are set forth in greater detail in the following Modes for Carrying Out the Invention and in the accompanying drawings.
Brief Description of Drawings
Figure 1 is a functional block diagram illustrating the basic functional structure of an encoder incorporating a preferred embodiment of the present invention.
Figure 2 is a functional block diagram illustrating the basic functional structure of a decoder incorporating a preferred embodiment of the present invention.
Figure 3 is a flowgraph illustrating the forward pre-transform function applied to a 16-sample time-domain signal sample block to form an 8-sample modified sample block for a basic embodiment of the present invention permitting implementation of an E-TDAC transform analysis filter bank by a DCT and DST.
Figure 4 is a flowgraph illustrating the forward pre-transform function applied to a 16-sample time-domain signal sample block to form an 8-sample modified sample block for an alternative embodiment of the present invention permitting implementation of an E-TDAC transform analysis filter bank by a DFT.
Figure 5 is a hypothetical graphical representation illustrating the time-reversal regions of the time-domain aliasing component created by the E-TDAC transform using the conventional TDAC phase term.
Figure 6 is a hypothetical graphical representation illustrating the time-reversal regions of the time-domain aliasing component created by the E-TDAC transform using the TDAC phase term required to cancel time-domain aliasing in an N-sample length block overlapped with a subsequent ½N-sample length block.
Figure 7 is a hypothetical graphical representation illustrating the boundary between time-reversal regions of the time-domain aliasing component in a ¼N-sample length block.
Figure 8 is a hypothetical graphical representation of a bridge transform illustrating the time-reversal regions of the time-domain aliasing component.
Figure 9 is a flowgraph illustrating the forward pre-transform function applied to a 16-sample time-domain signal sample block to form an 8-sample modified sample block permitting implementation of an adaptive-length E-TDAC transform analysis filter bank by a DCT and DST.
Figure 10 is a flowgraph illustrating the forward pre-transform function applied to a 16-sample time-domain signal sample block to form an 8-sample modified sample block permitting implementation of an adaptive-length O-TDAC transform analysis filter bank by a DST.
Modes for Carrying Out the Invention
I. Overview of Functional Structure
Figure 1 illustrates the basic functional structure of a transform-based encoder incorporating an embodiment of the present invention. According to this structure, an encoder comprises buffer 102 which buffers input samples received from input path 100 into time-domain signal sample blocks, forward pre-transform 106 which generates modified samples by combining pairs of signal samples received from buffer 102 and in response to information received from path 104 establishing the number of signal samples constituting a time-domain signal sample block, forward transform 108 which transforms the modified samples into frequency-domain transform coefficients by applying a transform whose length is adapted in response to information received from path 104, forward post-transform 110 which generates spectral information from the frequency-domain transform coefficients and in response to the information received from path 104, and formatter 112 which assembles digital information including the spectral information into a form suitable for transmission or storage along path 114. The functions performed by buffer 102 and formatter 112 are not discussed in detail herein.
Figure 2 illustrates the basic functional structure of a transform-based decoder incorporating an embodiment of the present invention. According to this structure, a decoder comprises deformatter 202 which extracts spectral information and information establishing the inverse transform length from the encoded digital signal received from path 200, inverse pre-transform 206 which generates frequency-domain transform coefficients from the extracted spectral information and in response to the information establishing the inverse transform length received along path 204, inverse transform 208 which transforms the frequency-domain transform coefficients into time-domain transform coefficients by applying a transform whose length is adapted in response to information received from path 204, inverse post-transform 210 which generates signal samples from the time-domain transform coefficients and in response to information received from path 204, and output processor 212 which generates along path 214 output samples corresponding to the input samples to a companion encoder in response to the signal samples. The functions performed by deformatter 202 and output processor 212 are not discussed in detail herein.
It should be appreciated from a study of the following disclosure and the accompanying claims that some elements shown in Figures 1 and 2 are not required to practice various embodiments of the present invention.
A basic embodiment of the present invention is introduced in some detail before alternative embodiments are discussed. This basic embodiment uses fixed-length E-TDAC transforms to implement the analysis and synthesis filter banks. Preferred embodiments of various features are described throughout the discussion.
II. Basic Embodiment of Invention
A. Input Sample Buffering
A buffer, represented by box 102 in Figure 1, receives signal samples and groups them into a sequence of time-domain signal sample blocks. Each block comprises N signal samples. The signal samples may be received from the sampling of an analog signal, from the generation of samples representing or simulating an analog signal, or from any other source of discrete-valued samples which correspond to a time-domain signal.
It is well known in the art that the frequency-resolving power or selectivity of a filter bank implemented by a discrete transform improves as the transform length increases. It is also well known that filter selectivity may be affected significantly by weighting the time-domain signal samples by a weighting function commonly called a window. See generally, Harris, "On the Use of Windows for Harmonic Analysis with the Discrete Fourier Transform," Proc. IEEE, vol. 66, January, 1978, pp. 51-83.
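The effect of a window on selectivity can be illustrated with a small numerical sketch. It compares the response far from a tone for an unweighted (rectangular) block and a Hann-weighted block; the block length of 64, the tone placed midway between bins, the probe bin, and the names `leakage`, `rect_leak`, and `hann_leak` are all assumptions chosen for illustration, and the Hann window is not the window required by the E-TDAC transform.

```python
import cmath
import math


def dft_magnitude(x, k):
    """Magnitude of DFT coefficient k of sequence x (naive evaluation)."""
    n_pts = len(x)
    return abs(
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
    )


def leakage(window, tone_bin, probe_bin):
    """Response at probe_bin to a windowed cosine located at tone_bin."""
    n_pts = len(window)
    tone = [math.cos(2 * math.pi * tone_bin * n / n_pts) for n in range(n_pts)]
    weighted = [w * s for w, s in zip(window, tone)]
    return dft_magnitude(weighted, probe_bin)


N_PTS = 64
rectangular = [1.0] * N_PTS
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / N_PTS) for n in range(N_PTS)]

# Tone midway between bins 10 and 11; probe a bin far from the tone.
rect_leak = leakage(rectangular, 10.5, 30)
hann_leak = leakage(hann, 10.5, 30)
print(rect_leak, hann_leak)
```

In this configuration the Hann-weighted block shows far less energy at the distant bin, at the cost of a wider main lobe around the tone itself.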
The E-TDAC transform used in the basic embodiment of the present invention requires window weighting, both weighting of the time-domain signal samples in an encoder prior to forward transform filtering, referred to as analysis windowing, and weighting of the recovered time-domain signal samples in a decoder after inverse transform filtering, referred to as synthesis windowing. Analysis- and synthesis-window weighting are discussed below only briefly. It is assumed herein that the buffered signal samples are weighted by an analysis window as may be required or desired. Input signal samples may be weighted by an analysis window prior to or subsequent to their receipt by the buffer without departing from the scope of the present invention.
B. Analysis Filter Bank - Forward Transform
Although the forward pre-transform function discussed below is applied to time-domain signal sample blocks prior to application of the forward transform, it is necessary to introduce the forward transform before the forward pre-transform function can be fully described. The forward transform is represented by box 108 in Figure 1.
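The E-TDAC transform used in the basic embodiment of the present invention is equivalent to the alternate application of a Modified Discrete Cosine Transform (MDCT) with a Modified Discrete Sine Transform (MDST). The MDCT and the MDST, shown in equations 1 and 2 respectively, are
$$ C(k) = \sum_{n=0}^{N-1} x(n) \cos\left[ 2\pi k \, \frac{n+m}{N} \right] \quad \text{for } 0 \le k < N \qquad (1) $$
$$ S(k) = \sum_{n=0}^{N-1} x(n) \sin\left[ 2\pi k \, \frac{n+m}{N} \right] \quad \text{for } 0 \le k < N \qquad (2) $$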
where k = frequency-domain transform coefficient number,
n = time-domain signal sample number,
N = time-domain signal sample block length,
m = phase term required for TDAC (see equation 6),
x(n) = time-domain signal sample n,
C(k) = MDCT frequency-domain transform coefficient k, and
S(k) = MDST frequency-domain transform coefficient k.
The E-TDAC transform produces one of two alternating sets of frequency-domain transform coefficients in response to each time-domain signal sample block. These sets of frequency-domain transform coefficients are of the form
where i = time-domain signal sample block number. Each set of coefficients generated by the MDCT and the MDST are referred to herein as MDCT coefficient sets and MDST coefficient sets, respectively.
Princen and Bradley showed that with the proper phase term m and a suitable pair of analysis-synthesis windows, the E-TDAC technique can accurately recover an input signal from an alternating sequence of overlapped fixed-length MDCT coefficient sets and MDST coefficient sets of the form
$$ \{C(k)\}_0, \{S(k)\}_1, \{C(k)\}_2, \{S(k)\}_3, \ldots \qquad (5) $$
Using only the alternate MDCT coefficient sets and MDST coefficient sets produces a time-domain aliasing component, but the aliasing component may be cancelled by choosing the appropriate phase term m for equations 1 and 2, applying the forward transform to overlapped analysis-window weighted time-domain signal sample blocks, and by synthesis-window weighting and adding adjacent overlapped time-domain signal sample blocks recovered by the inverse transform.
The phase term m in equations 1 and 2 controls the phase shift of the time-domain aliasing distortion. To cancel this alias distortion and accurately recover the original time-domain signal, E-TDAC requires the aliasing to be as follows: for the MDCT, the time-domain alias component consists of the first half of the sampled and windowed signal reversed in time about the one-quarter point of the sample block and the second half of the sampled and windowed signal reversed in time about the three-quarter point of the sample block; for the MDST, the alias component is similar to that for the MDCT except its amplitude is inverted in sign. These relationships are illustrated in Figure 5 in which the time-domain aliasing component, shown by a broken line, and the desired signal, shown by a solid line, have been weighted by a synthesis window.
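The aliasing structure described above can be checked numerically with a minimal sketch. It assumes the MDCT and MDST take the forms C(k) = Σ x(n) cos[2πk(n+m)/N] and S(k) = Σ x(n) sin[2πk(n+m)/N] with phase term m = (N/2 + 1)/2, and that the inverse transforms apply the same kernels with a 1/N scale factor. For simplicity all N coefficients are evaluated rather than only the unique ones, and all function names are hypothetical.

```python
import math


def phase_term(n_pts):
    """Assumed TDAC phase term m = (N/2 + 1) / 2."""
    return (n_pts / 2 + 1) / 2


def transform(x, trig):
    """Forward MDCT (trig=math.cos) or MDST (trig=math.sin), assumed form."""
    n_pts, m = len(x), phase_term(len(x))
    return [
        sum(x[n] * trig(2 * math.pi * k * (n + m) / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def inverse_transform(coeffs, trig):
    """Inverse with the same kernel and an assumed 1/N scale factor."""
    n_pts, m = len(coeffs), phase_term(len(coeffs))
    return [
        sum(coeffs[k] * trig(2 * math.pi * k * (n + m) / n_pts)
            for k in range(n_pts)) / n_pts
        for n in range(n_pts)
    ]


def aliased(x, sign):
    """Half the signal plus (sign=+1) or minus (sign=-1) its time-reversed halves."""
    n_pts = len(x)
    # Index (N/2 - 1 - n) mod N reverses the first half about the one-quarter
    # point and the second half about the three-quarter point of the block.
    return [
        0.5 * (x[n] + sign * x[(n_pts // 2 - 1 - n) % n_pts]) for n in range(n_pts)
    ]


block = [float(n + 1) for n in range(8)]
from_mdct = inverse_transform(transform(block, math.cos), math.cos)
from_mdst = inverse_transform(transform(block, math.sin), math.sin)
print(from_mdct)
print(aliased(block, +1))
```

Under these assumptions the MDCT round trip returns the signal plus its time-reversed halves, and the MDST round trip returns the signal minus them, which is the sign inversion that allows the alias terms of adjacent overlapped blocks to cancel.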
The phase term required to produce the appropriate aliasing components for alias cancellation is
$$ m = \frac{N/2 + 1}{2} \qquad (6) $$
C. Forward Pre-Transform Function
The processing requirements of the technique used to evaluate the MDCT and the MDST may be reduced by applying a forward pre-transform function to the time-domain signal sample blocks to produce blocks of modified samples, and applying a ½N-point DCT and a ½N-point DST to the modified sample blocks for the N-point MDCT and MDST, respectively. The forward pre-transform function is represented by box 106 in Figure 1. For E-TDAC, the pre-transform function combines pairs of signal samples in each time-domain signal sample block of length N to produce a block of modified samples of length ½N.
The mathematical basis for using a ½N-point DCT applied to modified samples to perform the N-point MDCT may be seen by first substituting equation 6 for the phase term m into equation 1. The MDCT in equation 1 may be expressed as
By setting and substituting it into expression 7a, by setting and substituting it into expressions 7b and 7c, and by setting and substituting it into expression 7d, it may be seen that equation 1 can be rewritten as
Finally, by defining a new sequence
where [i] mod M represents the value of i modulo M , the expressions 8a and 8b may be combined and written as
which is a ½N-point DCT for y(n).
From a similar derivation, it can be shown that the MDST of length N can be implemented by a DST of length ½N;
where
It should be appreciated that the forward pre-transform function for this basic embodiment, as well as for alternative embodiments discussed below, can be performed by any of several implementations including software-controlled processors and circuits capable of combining pairs of time-domain signal samples to form modified samples. One flowgraph for a 16-sample block is shown in Figure 3 which illustrates the forward pre-transform functions of equations 9 and 12 for a basic embodiment of the present invention. The minus signs shown within parentheses denote terms which are subtractively combined with an associated sample for the function shown above in equation 12. This subtractive combination may be accomplished in circuitry, for example, by negating the value of the signal sample representations corresponding to the nodes in Figure 3 with minus signs in parentheses and additively combining the resultant representations.
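A software form of such a pre-transform can be sketched as follows. The sketch assumes the MDCT and MDST forms C(k) = Σ x(n) cos[2πk(n+m)/N] and S(k) = Σ x(n) sin[2πk(n+m)/N] with m = (N/2 + 1)/2, and derives one folding consistent with them: y(n) = x([n - N/4] mod N) + x([3N/4 - 1 - n] mod N) for the cosine case, and the same pair combined subtractively for the sine case. The sample pairing drawn in Figure 3 may be indexed differently, and all function names are hypothetical.

```python
import math


def phase_term(n_pts):
    """Assumed TDAC phase term m = (N/2 + 1) / 2."""
    return (n_pts / 2 + 1) / 2


def direct(x, k, trig):
    """N-point MDCT (trig=math.cos) or MDST (trig=math.sin) coefficient k."""
    n_pts, m = len(x), phase_term(len(x))
    return sum(x[n] * trig(2 * math.pi * k * (n + m) / n_pts) for n in range(n_pts))


def fold(x):
    """Combine pairs of samples into two half-length blocks of modified samples."""
    n_pts = len(x)  # assumed to be a multiple of 4
    quarter = n_pts // 4
    added, subtracted = [], []
    for n in range(n_pts // 2):
        first = x[(n - quarter) % n_pts]
        second = x[(3 * quarter - 1 - n) % n_pts]
        added.append(first + second)       # modified samples for the cosine case
        subtracted.append(first - second)  # modified samples for the sine case
    return added, subtracted


def half_length(modified, k, trig):
    """Half-length DCT- or DST-style sum over the modified samples."""
    half = len(modified)
    return sum(
        modified[n] * trig(math.pi * k * (2 * n + 1) / (2 * half))
        for n in range(half)
    )


block = [0.3, -1.2, 2.5, 0.7, -0.4, 1.9, -2.2, 0.1,
         1.1, -0.6, 0.8, 2.4, -1.7, 0.2, 0.9, -0.5]
y, z = fold(block)
worst = max(
    max(abs(half_length(y, k, math.cos) - direct(block, k, math.cos)),
        abs(half_length(z, k, math.sin) - direct(block, k, math.sin)))
    for k in range(len(block))
)
print(len(y), worst)  # 8 modified samples; difference at rounding-error level
```

Each half-length sum touches ½N modified samples instead of N signal samples, which is where the reduction in processing comes from once the half-length transform is itself evaluated by a fast algorithm.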
D. Forward Post-Transform Function
In principle, the forward transform in the basic embodiment of the present invention generates frequency-domain transform coefficients in response to an input signal which are equivalent to the coefficients generated by an E-TDAC transform applied to the same input signal. Some alternative embodiments of the present invention described below require application of a forward post-transform function to the coefficients generated by the forward transform in order to obtain spectral information equivalent to transform coefficients generated by a corresponding TDAC transform.
If an application does not require spectral information, then an encoder incorporating any embodiment of the present invention need not apply a forward post-transform function to the frequency-domain transform coefficients generated by the forward transform. For example, the frequency-domain transform coefficients themselves may be directly transmitted or stored and subsequently transmitted to a corresponding receiver for decoding.
For many applications, however, spectral information is required. For example, encoder/decoder systems which exploit psychoacoustic principles to reduce coded signal information requirements usually require spectral information in order to estimate psychoacoustic masking effects of a signal's spectral components.
Frequency-domain transform coefficients generated by the forward transform and spectral information generated by the various forward post-transform functions described below are generally not suitable for low bit-rate transmission or efficient storage. Various quantization techniques may be used to reduce informational requirements by taking advantage of a signal's irrelevancy.
In a practical implementation of an encoder incorporating a basic embodiment of the present invention, the forward post-transform function represented by box 110 in Figure 1 may comprise quantizing the frequency-domain transform coefficients generated by the forward transform; however, quantizing is not required to practice the present invention.
E. Output Formatting
Output formatting represented by box 112 in Figure 1 is not required to practice the present invention but is often used in signal encoding applications. Generally, output formatting assembles the spectral information and other information required for transmission or storage. Any additional side-information needed by a decoder is also assembled into the formatted signal. Frame synchronization bits and error detection/correction codes may be used as needed for transmission. Database pointers or keys may be added as needed for storage. The formatted data is ready for transmission or for storage along path 114 shown in Figure 1.
F. Input Deformatting
Input deformatting represented by box 202 in Figure 2 is not required to practice the present invention but is often used in signal decoding applications. Deformatting extracts spectral information and any side information from a formatted signal received from path 200 either by receipt of a transmitted signal or by retrieval from storage.
G. Inverse Pre-Transform Function
The inverse pre-transform function, represented by box 206 in Figure 2, obtains frequency-domain transform coefficients from the spectral information in the received signal. If the spectral information in the received signal substantially corresponds to the frequency-domain transform coefficients generated by an E-TDAC transform, then the inverse pre-transform function in a basic embodiment of the present invention may be a trivial or essentially null function such as, for example, grouping spectral information into blocks.
In a practical implementation of a decoder incorporating a basic embodiment of the present invention, the inverse pre-transform function may comprise dequantizing the encoded digital information into a form suitable for input to the inverse transform filter bank; however, dequantizing is not required to practice the present invention.
H. Synthesis Filter Bank - Inverse Transform
Box 208 in Figure 2 represents a bank of synthesis filters which transforms each set of frequency-domain transform coefficients into time-domain transform coefficients. A transform inverse to that used in analysis filter bank 108 in Figure 1 implements synthesis filter bank 208. The inverse discrete transform for E-TDAC used in the basic embodiment of the present invention is an alternating application of an Inverse Modified Discrete Cosine Transform (IMDCT) and an Inverse Modified Discrete Sine Transform (IMDST) shown in equations 13 and 14, respectively;
where Ĉ(k) = recovered MDCT frequency-domain transform coefficient k,
Ŝ(k) = recovered MDST frequency-domain transform coefficient k, and
x̂(n) = recovered time-domain signal sample n.
I. Inverse Post-Transform Function
The processing requirements of the technique used to evaluate the IMDCT and the IMDST may be reduced by instead evaluating an Inverse DCT (IDCT) and an Inverse DST (IDST) and applying an inverse post-transform function after application of the inverse transforms. This inverse post-transform function is represented by box 210 in Figure 2.
For E-TDAC, the inverse post-transform function splits time-domain transform coefficients into signal samples. Using a derivation similar to that discussed above for the forward transform, it can be shown that, with an appropriate inverse post-transform function discussed below, the IMDCT of length N can be implemented by an IDCT of length ½N;
where ŷ(n) = recovered time-domain transform coefficient n, and
Recovered time-domain signal samples may be obtained from the time-domain transform coefficients according to
With an appropriate inverse post-transform function, the IMDST of length N can be implemented by an IDST of length ½N;
where = recovered time-domain transform coefficient n.
Recovered time-domain signal samples may be obtained from the time-domain transform coefficients according to
J. Output Sample Processing
An overlap-add process is required by the TDAC transforms to generate samples corresponding to signal samples encoded by a companion encoder. This process, represented by box 212 in Figure 2, overlaps adjacent blocks of recovered time-domain samples and adds the samples in one block to samples in the adjacent overlapped block.
The E-TDAC transform used in the basic embodiment of the present invention also requires, prior to overlap-add, the application of a synthesis window to the recovered time-domain sample blocks. The constraints the E-TDAC transform places upon the design of the synthesis window, the analysis window, and the overlap-add process are discussed fully in the paper by Princen and Bradley referred to above.
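The windowing and overlap-add steps can be illustrated end to end with a minimal sketch. It assumes MDCT and MDST kernels cos[2πk(n+m)/N] and sin[2πk(n+m)/N] with m = (N/2 + 1)/2, a sine window w(n) = sin[π(n + ½)/N] used for both analysis and synthesis, blocks overlapped by ½N samples, and a gain of 2 applied at synthesis. The window, the gain, the block length of 16, and all names are assumptions chosen so that the example is self-consistent; they are not taken from the window design of the cited paper.

```python
import math

N_PTS = 16
HOP = N_PTS // 2                      # adjacent blocks overlap by N/2 samples
M_PHASE = (N_PTS / 2 + 1) / 2         # assumed TDAC phase term
WINDOW = [math.sin(math.pi * (n + 0.5) / N_PTS) for n in range(N_PTS)]


def forward(block, trig):
    """Forward MDCT (trig=math.cos) or MDST (trig=math.sin), assumed form."""
    return [
        sum(block[n] * trig(2 * math.pi * k * (n + M_PHASE) / N_PTS)
            for n in range(N_PTS))
        for k in range(N_PTS)
    ]


def inverse(coeffs, trig):
    """Inverse with the same kernel and an assumed 1/N scale factor."""
    return [
        sum(coeffs[k] * trig(2 * math.pi * k * (n + M_PHASE) / N_PTS)
            for k in range(N_PTS)) / N_PTS
        for n in range(N_PTS)
    ]


def encode_decode(signal):
    """Analysis-window, transform, inverse-transform, synthesis-window, overlap-add."""
    output = [0.0] * len(signal)
    n_blocks = (len(signal) - N_PTS) // HOP + 1
    for i in range(n_blocks):
        start = i * HOP
        trig = math.sin if i % 2 else math.cos   # alternate MDCT and MDST blocks
        block = [WINDOW[n] * signal[start + n] for n in range(N_PTS)]
        recovered = inverse(forward(block, trig), trig)
        for n in range(N_PTS):
            # Synthesis window, gain of 2, then add into the overlapped output.
            output[start + n] += 2.0 * WINDOW[n] * recovered[n]
    return output


signal = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(48)]
rebuilt = encode_decode(signal)
# Samples 8..39 are covered by two blocks, so their alias terms cancel.
interior_error = max(abs(rebuilt[t] - signal[t]) for t in range(HOP, 40))
print(interior_error)
```

Only samples covered by two overlapped blocks are recovered; the first and last ½N samples of the sequence retain uncancelled aliasing because each has only one block contributing to it.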
III. Alternative Fixed-Length Embodiments
Alternative embodiments of the present invention may achieve greater reductions in processing requirements. The following description discusses the differences between these alternative embodiments and the basic embodiment described above.
A. E-TDAC Implemented by DFT
In one alternative embodiment of the present invention for an encoder, the forward E-TDAC transform is implemented by a Discrete Fourier Transform (DFT).
A forward pre-transform function generates an alternating sequence of two types of blocks comprising modified samples; one block type comprising modified samples p(n) and a second block type comprising modified samples r(n). Each modified sample is formed from the combination of one pair of signal samples x(n) according to
A flowgraph for a 16-sample block illustrating this forward pre-transform function is shown in Figure 4.
The forward E-TDAC transform is implemented by a DFT which generates alternating sets of complex-valued frequency-domain transform coefficients P(k) of the form T(k)+j·U(k) and R(k) of the form V(k)+j·W(k) in response to the alternating sequence of modified sample blocks;
where j = √(−1).
Spectral information corresponding to E-TDAC transform coefficients C(k) and S(k) is obtained by applying a forward post-transform function according to
In one alternative embodiment of the present invention for a decoder, the inverse E-TDAC transform is implemented by an Inverse DFT (IDFT).
An inverse pre-transform function recovers spectral information Ĉ(k) and Ŝ(k), corresponding to E-TDAC transform coefficients C(k) and S(k), respectively, from the encoded signal and generates in response to the recovered spectral information an alternating sequence of two types of blocks comprising recovered frequency-domain transform coefficients; one block type comprises recovered complex-valued coefficients P̂(k) of the form T̂(k)+j·Û(k) and a second block type comprises recovered complex-valued coefficients R̂(k) of the form V̂(k)+j·Ŵ(k). The real and imaginary parts of the frequency-domain transform coefficients are obtained according to
An inverse transform generates an alternating sequence of two types of blocks comprising recovered time-domain transform coefficients by applying an IDFT to the alternating sequence of frequency-domain transform coefficient blocks; one block type comprises recovered time-domain transform coefficients p̂(n) and a second block type comprises recovered time-domain transform coefficients r̂(n).
The IDFT used to recover the time-domain transform coefficients is shown in equations 31 and 32;
Recovered time-domain signal samples are obtained by applying an inverse post-transform function to the alternating sequence of blocks comprising recovered time-domain transform coefficients. Signal samples are obtained from blocks comprising the p̂(n) coefficients according to
Signal samples are obtained from blocks comprising the r̂(n) coefficients according to
B. E-TDAC Implemented by Concurrent DFT
In another embodiment of the present invention for an encoder, the MDCT and the MDST of one or more forward E-TDAC transforms are implemented concurrently by one or more DFTs. In single-channel encoder applications, two adjacent frequency-domain coefficient sets as illustrated in expression 5 above may be generated concurrently by a single DFT. In two-channel applications, a MDCT coefficient set for channel one may be generated concurrently with a MDST coefficient set for channel two, immediately followed by a MDST coefficient set for channel one generated concurrently with a MDCT coefficient set for channel two. Other combinations of coefficient sets for concurrent processing are possible. For additional detail on concurrent transforms, see generally Brigham, The Fast Fourier Transform, Englewood Cliffs, NJ: Prentice-Hall, Inc., 1974, pp. 166-67.
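The general idea behind concurrent transforms can be sketched with the textbook technique of obtaining the DFTs of two real sequences from a single complex DFT. This is an illustrative sketch only: it shows the generic separation step, not the specific post-transform functions of the invention, and the function name `two_real_dfts` is hypothetical.

```python
import numpy as np

def two_real_dfts(p, r):
    """Return DFT(p) and DFT(r) for real sequences p and r using one complex FFT."""
    q = np.asarray(p, dtype=float) + 1j * np.asarray(r, dtype=float)
    Q = np.fft.fft(q)
    # conj(Q[(-k) mod N]) for every k: reverse, rotate by one, conjugate.
    Q_rev = np.conj(np.roll(Q[::-1], 1))
    P = 0.5 * (Q + Q_rev)        # conjugate-symmetric part -> DFT of p
    R = -0.5j * (Q - Q_rev)      # conjugate-antisymmetric part -> DFT of r
    return P, R

rng = np.random.default_rng(0)
p = rng.standard_normal(8)
r = rng.standard_normal(8)
P, R = two_real_dfts(p, r)
print(np.allclose(P, np.fft.fft(p)), np.allclose(R, np.fft.fft(r)))
```

The separation relies on the conjugate symmetry of the DFT of a real sequence, which is why both input blocks must have the same length.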
A forward pre-transform function generates a sequence of blocks comprising complex-valued modified samples q(n) of the form p(n)+j·r(n) where p(n) and r(n) are formed from the application of the forward pre-transform function described above and shown in equations 21 and 22.
The MDCT and the MDST constituting the forward E-TDAC transform are concurrently implemented by a DFT which generates complex-valued frequency-domain transform coefficients Q(k) of the form G(k)+j·H(k) according to
Spectral information corresponding to E-TDAC transform coefficients C(k) and S(k) is obtained by applying the forward post-transform functions according to
In another embodiment of the present invention for a decoder, the IMDCT and the IMDST of one or more inverse E-TDAC transforms are implemented concurrently by one or more IDFTs.
An inverse pre-transform function recovers spectral information Ĉ(k) and Ŝ(k), corresponding to E-TDAC transform coefficients C(k) and S(k), respectively, from the encoded signal and generates in response to the recovered spectral information a sequence of blocks comprising recovered complex-valued frequency-domain transform coefficients Q̂(k) of the form Ĝ(k)+j·Ĥ(k), where Ĝ(k) and Ĥ(k) are obtained from recovered spectral information according to
The IMDCT and the IMDST constituting the inverse E-TDAC transform are concurrently implemented by an IDFT which generates complex-valued time-domain transform coefficients q̂(n) of the form p̂(n)+j·r̂(n) according to
Time-domain signal samples x(n) are recovered from the application of the inverse post-transform function described above and shown in equations 33 through 36.
IV. Adaptive-Length Embodiments
A. Bridge Transform
As mentioned above, it is desirable that the technique which improves transform processing efficiency should also permit adaptive selection of the transform length. The means and considerations required to implement an adaptive-transform-length coder are not discussed here, but are discussed in International Patent Application PCT/US 91/02512, Publication No. WO 91/16769 (published October 31, 1991).
Changes in the length of either the E-TDAC transform or the O-TDAC transform may require changes in the phase term m in order to realize time-domain aliasing cancellation. Figure 5 is a hypothetical graphical representation of two adjacent overlapped N-sample length time-domain signal sample blocks recovered by an inverse E-TDAC transform, one block recovered from the IMDCT and the second block recovered from the IMDST after synthesis windowing but before overlap-add of the adjacent blocks has cancelled time-domain aliasing. The representation in this and other figures does not show individual signal samples, but rather illustrates only the envelope of the amplitude of samples within the windowed signal sample blocks.
Each recovered signal sample block comprises two components: one component, represented in the figures by a solid line, substantially corresponds to the analysis- and synthesis-window weighted input signal samples, and the second component, represented in the figures by a broken line, corresponds to the analysis- and synthesis-window weighted time-domain aliasing distortion. As discussed above, the aliasing component is a time-reversed replica of the windowed input signal samples which occurs in two separate regions. The phase term m for the E-TDAC and the O-TDAC transforms controls the location of the boundary between these two regions. For fixed-length E-TDAC and O-TDAC transforms, the boundary is located at the mid-point of the signal sample block. The phase term required for time-domain aliasing cancellation under this condition is shown in equation 6.
Figure 6 is a hypothetical graphical representation of three time-domain signal sample blocks recovered from an inverse E-TDAC transform prior to overlap-add. The first block is an N-sample length block which has been recovered from the IMDCT. The second and third blocks are ½N-sample length blocks which have been recovered from the IMDST. The aliasing component in the N-sample length MDCT block comprises a replica of the first half of the signal sample block reversed in time about the one-quarter point of the sample block, and a replica of the second half of the sampled signal reversed in time about the three-quarter point of the sample block. If overlap-add of the second half of the MDCT block and the first MDST ½N-sample length block shown in Figure 6 is to cancel time-domain aliasing, the time-domain aliasing component in the first MDST ½N-sample length block must be a replica of the entire ½N-sample length block inverted in sign and time-reversed end-for-end. The phase term m required by the MDST and IMDST transforms to produce a time-domain aliasing component with these characteristics is m = ½.
It can be shown that the phase term may be written generally as
m = (Φ + 1) / 2
where Φ is the location of the boundary between time-reversal regions, expressed as the number of time-domain signal samples from the right-hand or trailing edge of the time-domain signal sample block.
For example, Figure 7 illustrates two window-weighted time-domain signal sample blocks. The right-hand block is ¼N samples in length. Within this block, the boundary between time-reversal regions is at a point N/8 samples from the right-hand or trailing edge of the block. Thus, the phase term m required to cause time-reversal of the aliasing component within each region of the ¼N-sample block is m = (N/8 + 1)/2 = N/16 + ½.
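How the phase term positions the boundary between the two time-reversal regions can be checked numerically. The sketch below rests on stated assumptions rather than on the equations of this description: it assumes the relation m = (Φ + 1)/2 (which gives m = ½ when the boundary lies at the trailing edge, Φ = 0), assumes an O-TDAC-style kernel cos[2π(k + ½)(n + m)/L] for an L-sample block, and applies no window. The names `phase_term` and `alias_component` are hypothetical.

```python
import numpy as np

def phase_term(phi):
    """Assumed relation between boundary location phi and phase term m."""
    return (phi + 1) / 2

def alias_component(x, phi):
    """Aliasing left by a forward/inverse transform pair on one unwindowed block."""
    L = len(x)                         # block length, assumed even
    m = phase_term(phi)
    k = np.arange(L // 2)[:, None]     # L/2 transform coefficients
    n = np.arange(L)[None, :]
    C = np.cos(2 * np.pi * (k + 0.5) * (n + m) / L)
    y = (4.0 / L) * (C.T @ (C @ x))    # forward, then scaled inverse
    return y - x                       # what remains is the aliasing term

L, phi = 16, 4                         # boundary 4 samples from the trailing edge
x = np.random.default_rng(1).standard_normal(L)
alias = alias_component(x, phi)

split = L - phi
# Region before the boundary: time-reversed and inverted in sign.
# Region after the boundary: time-reversed with sign preserved.
expected = np.concatenate([-x[:split][::-1], x[split:][::-1]])
print(np.allclose(alias, expected))
```

Setting `phi = L // 2` places the boundary at the mid-point of the block, which corresponds to the fixed-length case.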
With this background established, it is now possible to introduce the "bridge transform." A bridge transform is a transform which bridges a shift from one transform length to another. For example, as shown in Figure 8, suppose the present invention is called upon to process one block of ½N samples followed by another block of ¼N samples. It is possible to perform a separate transform for each block. For reasons whose explanation exceeds the scope of this discussion, a bridge transform improves coder performance by instead transforming a single block of ¾N samples.
The bridge transform required to process the ¾N-sample block shown in Figure 8 may be implemented by an FFT to compute the transform for three ¼N blocks followed by a recombination operation. This technique is known in the art and is discussed in Oppenheim and Schafer, Digital Signal Processing, Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1975, pp. 307-14. The FFT with this recombination operation can also be used to concurrently process two E-TDAC bridge transforms in the same manner as that briefly discussed above for fixed-length transforms. It is important to note, however, that concurrent processing in E-TDAC is possible only for a MDCT and MDST which have the same length and TDAC phase term.
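A transform whose length is three times a convenient FFT length can be built from three shorter FFTs plus a recombination step. The following sketch shows one standard decimation-in-time arrangement, in which the three sub-blocks are the interleaved subsequences of the input; it is offered as an assumption about what such a recombination can look like, and the cited text may organise the computation differently. The function name `fft_three_way` is hypothetical.

```python
import numpy as np

def fft_three_way(x):
    """DFT of a length-3M block from three length-M FFTs and a recombination."""
    x = np.asarray(x, dtype=complex)
    L = len(x)
    if L % 3:
        raise ValueError("block length must be a multiple of 3")
    k = np.arange(L)
    X = np.zeros(L, dtype=complex)
    for r in range(3):
        sub = np.fft.fft(x[r::3])                    # FFT of every third sample
        twiddle = np.exp(-2j * np.pi * r * k / L)    # recombination factors
        X += twiddle * np.tile(sub, 3)               # sub-FFT is periodic in k
    return X

block = np.random.default_rng(2).standard_normal(12)   # e.g. 3N/4 with N = 16
print(np.allclose(fft_three_way(block), np.fft.fft(block)))
```

With N a power of two, each of the three sub-transforms has length ¼N and can itself use a radix-2 FFT.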
For the decoder, the length of the inverse transform may be established by side-information passed by the encoder in the encoded signal. The same considerations for adaptive-length transforms and bridge transforms that were discussed above for the forward transform, including the phase term required for time-domain aliasing cancellation, also apply to the inverse transform.
The following describes differences between adaptive-length embodiments of the present invention and the fixed-length embodiments discussed above. The structure of each adaptive-length embodiment is substantially the same as that for a corresponding fixed-length embodiment. The most significant differences pertain to the pre- and post-transform functions and to the length and phase terms of the transform functions.
In the following discussions, each time-domain signal sample block is defined to be a+b samples in length, overlapping the immediately prior block by a samples and overlapping the immediately subsequent block by b samples. It is assumed that the number of samples in the two overlap intervals may vary from block to block. According to the conventions established in the previous discussion, the bridge transform applied to each time-domain signal sample block is an adaptive-length (a+b)-point transform.
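The a+b block convention can be made concrete with a small bookkeeping sketch. It assumes the block sequence is described by a list of overlap lengths, so that the b of one block is the a of the next; the helper name `block_layout` and the list representation are hypothetical and serve only to illustrate the convention.

```python
def block_layout(overlaps):
    """Return (start, a, b) for each block given successive overlap lengths.

    overlaps[i] is the number of samples shared by block i-1 and block i,
    so block i has a = overlaps[i] and b = overlaps[i + 1].
    """
    blocks = []
    start = 0
    for a, b in zip(overlaps[:-1], overlaps[1:]):
        blocks.append((start, a, b))
        start += a          # the next block begins where this block's b-overlap begins
    return blocks

# With N = 32: an N-sample block, a 3N/4-sample bridge block, then an N/2-sample block.
for start, a, b in block_layout([16, 16, 8, 8]):
    print(f"start={start:2d}  a={a:2d}  b={b:2d}  length={a + b}")
```

In this example the middle block has a = 16 and b = 8, so it is a 24-point block bridging the change in overlap length.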
B. E-TDAC Implemented by DCT/DST
One adaptive-length embodiment corresponds to the fixed-length basic embodiment discussed above. The forward pre-transform, corresponding to the functions shown in equations 9 and 12 of the fixed-length embodiment, generates an alternating sequence of modified sample blocks according to
A flowgraph illustrating this forward pre-transform function for a 16-sample block with a=4 and b=12 is shown in Figure 9.
The forward transform comprises a DCT and a DST according to
The inverse transform comprises an IDCT and an IDST according to
The inverse post-transform function, corresponding to the functions shown in equations 16, 17, 19, and 20 for the fixed-length embodiment, recovers time-domain signal samples from time-domain transform coefficients according to
C. E-TDAC Implemented by DFT
Another adaptive-length embodiment corresponds to the fixed-length embodiment of the E-TDAC transform implemented by a DFT, discussed above. The forward pre-transform, corresponding to the functions shown in equations 21 and 22 for the fixed-length embodiment, generates an alternating sequence of modified sample blocks according to
The forward E-TDAC transform is implemented by a DFT which generates alternating sets of complex-valued frequency-domain transform coefficients P(k) of the form T(k)+j·U(k) and R(k) of the form V(k)+j·W(k) in response to the alternating sequence of modified sample blocks according to
The forward post-transform, corresponding to the functions shown in equations 25 and 26 for the fixed-length embodiment, generates alternating sets of spectral information according to
The inverse pre-transform function, corresponding to the functions shown in equations 27 through 30 for the fixed-length embodiment, generates an alternating sequence of blocks comprising recovered frequency-domain transform coefficients; one block type comprises recovered complex-valued coefficients P̂(k) of the form T̂(k)+j·Û(k) and a second block type comprises recovered complex-valued coefficients R̂(k) of the form V̂(k)+j·Ŵ(k). Each frequency-domain transform coefficient is obtained according to
The inverse transform generates an alternating sequence of two types of blocks comprising recovered time-domain transform coefficients by applying an IDFT to the alternating sequence of frequency-domain transform coefficient blocks; one block type comprises recovered time-domain transform coefficients p̂(n) and a second block type comprises recovered time-domain transform coefficients r̂(n). The IDFT used to obtain the recovered time-domain transform coefficients is shown in equations 64 and 65;
The inverse post-transform function, corresponding to the functions shown in equations 33 through 36 for the fixed-length embodiment, obtains recovered time-domain signal samples from recovered time-domain transform coefficients according to
where 0≤ n < a+b.
D. O-TDAC Implemented by DST
The O-TDAC transform utilizes a MDCT of the form
where E(k) = frequency-domain transform coefficient k.
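A direct, unoptimized form of such a transform is useful as a reference against which a reduced-complexity implementation can be checked. The sketch below covers only the fixed-length special case a = b = ½N and rests on assumptions not confirmed by this description: a kernel of the form cos[2π(k + ½)(n + m)/N] with m = (N/2 + 1)/2, identical sine analysis and synthesis windows, and an inverse scaled by 4/N. It is an O(N²) matrix form, not the DST-based method, and all function names are hypothetical.

```python
import numpy as np

def otdac_kernel(N):
    """Assumed O-TDAC cosine kernel for an N-sample block (N/2 coefficients)."""
    m = (N / 2 + 1) / 2                      # fixed-length phase term
    k = np.arange(N // 2)[:, None]
    n = np.arange(N)[None, :]
    return np.cos(2 * np.pi * (k + 0.5) * (n + m) / N)

def sine_window(N):
    return np.sin(np.pi * (np.arange(N) + 0.5) / N)

def analyze(x, N):
    """Window each half-overlapped block and apply the forward transform."""
    C, w, hop = otdac_kernel(N), sine_window(N), N // 2
    return [C @ (w * x[s:s + N]) for s in range(0, len(x) - N + 1, hop)]

def synthesize(coeff_blocks, N):
    """Inverse transform, synthesis window, and overlap-add."""
    C, w, hop = otdac_kernel(N), sine_window(N), N // 2
    out = np.zeros((len(coeff_blocks) + 1) * hop)
    for i, E in enumerate(coeff_blocks):
        out[i * hop : i * hop + N] += w * (4.0 / N) * (C.T @ E)
    return out

N = 16
x = np.random.default_rng(3).standard_normal(64)
y = synthesize(analyze(x, N), N)
# Aliasing cancels wherever two blocks overlap (all but the outer half-blocks).
print(np.allclose(y[N // 2 : -N // 2], x[N // 2 : -N // 2]))
```

Each block yields only ½N coefficients for N input samples, so the analysis-synthesis system is critically sampled while still reconstructing the overlapped interior.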
The processing requirements needed to implement this transform can be reduced by applying a forward pre-transform function to the time-domain signal samples to generate modified samples e(n), then applying a DST to the modified samples to generate frequency-domain transform coefficients X(k). The forward pre-transform function is
A flowgraph illustrating this forward pre-transform function for a 16-sample block with a=4 and b=12 is shown in Figure 10. The minus signs denote terms which are subtractively combined with an associated sample for the function shown above in equation 71.
The forward transform comprises a DST according to
The inverse transform comprises an IDST according to
where ê(n) = recovered time-domain transform coefficient n, and
X̂(k) = recovered frequency-domain transform coefficient k.
The inverse post-transform function obtains recovered time-domain signal samples x̂(n) from recovered time-domain transform coefficients according to