WO2009117967A1 - Method and apparatus for encoding and decoding - Google Patents
Method and apparatus for encoding and decoding
- Publication number
- WO2009117967A1 (PCT/CN2009/071030)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- frame, superframe, background noise, current, coding
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/012—Comfort noise or silence coding
- the present application claims priority to Chinese Patent Application No. 200810084077.6, the entire disclosure of which is incorporated herein by reference.
- TECHNICAL FIELD The present invention relates to the field of communications technologies, and in particular, to a method and an apparatus for encoding and decoding.
- encoding and decoding of background noise is performed according to the noise processing scheme specified in G.729B, established by the ITU (International Telecommunication Union).
- a silence compression technique is introduced in the speech coder; its signal processing principle block diagram is shown in FIG. 1.
- the silence compression technique mainly comprises three modules: VAD (Voice Activity Detection), DTX (Discontinuous Transmission), and CNG (Comfort Noise Generator); VAD and DTX are modules at the encoding end, while CNG is a module at the decoding end.
- Figure 1 is a block diagram of a simple silent compression system.
- the VAD module analyzes the current input signal to detect whether it contains a speech signal; if it does, the current frame is set as a speech frame, otherwise it is set as a non-speech frame.
- the encoder encodes the current signal according to the VAD detection result. If the VAD detection result is a speech frame, the signal enters the speech encoder for speech encoding, and the output is a speech frame; if the VAD detection result is a non-speech frame, the signal enters the DTX module, which performs background noise processing with a non-speech encoder and outputs non-speech frames.
- the received signal frames (including speech frames and non-speech frames) are decoded at the receiving (decoding) end. If a received signal frame is a speech frame, it is decoded by the speech decoder; otherwise it enters the CNG module, which decodes the background noise according to the parameters carried in the non-speech frame to generate comfortable background noise or silence, so that the decoded signal sounds more natural and continuous. By introducing this variable-rate coding method in the encoder and adapting the encoding to the signal during silent phases, the silence compression technique effectively solves the problem of background noise discontinuity and improves the synthesized signal quality. For this reason, the background noise at the decoding end is also called comfort noise.
- the average coding rate of the system is also greatly reduced, thereby effectively saving bandwidth.
- when G.729B processes a signal, the signal is processed frame by frame, with a frame length of 10 ms.
- in order to save bandwidth, G.729.1 also defines requirements for a silence compression system: during background noise, the noise must be encoded and transmitted without degrading the overall coding quality of the signal; i.e., DTX and CNG are defined.
- a more important requirement is that its DTX/CNG system be compatible with G.729B.
- although G.729B's DTX/CNG system could in principle be ported to G.729.1, two problems must be solved: first, the processing frame lengths of the two encoders are different, so direct migration causes problems; second, the G.729B DTX/CNG system is somewhat simple, especially its parameter extraction part.
- therefore, the G.729B DTX/CNG system needs to be extended.
- the signal bandwidth processed by G.729.1 is wideband, while the bandwidth processed by G.729B is narrowband;
- therefore the high-band portion (4000 Hz to 7000 Hz) of the background noise signal must also be added to make a complete system.
- the existing G.729B system handles only narrowband background noise, so the quality of the encoded signal cannot be guaranteed when it is transplanted into the G.729.1 system.
- an object of one or more embodiments of the present invention is to provide a method and an apparatus for encoding and decoding which, by extending G.729B, meet the requirements of the G.729.1 technical standard while significantly reducing the communication bandwidth of the signal.
- an embodiment of the present invention provides a coding method, including:
- a decoding method comprising: obtaining CNG parameters for the first frame of the first superframe from the speech coded frames preceding that frame; and performing background noise decoding on the first frame of the first superframe according to the CNG parameters, the CNG parameters including: a target excitation gain determined from the long-term-smoothed fixed codebook gains of the quantized speech coded frames; and LPC filter coefficients obtained by long-term smoothing of the quantized speech coded frames.
- an encoding device including: a first extracting unit, configured to extract background noise characteristic parameters during the trailing time;
- a second coding unit, configured to perform background noise coding for the first superframe after the trailing time, according to the extracted background noise characteristic parameters of the trailing time and the background noise characteristic parameters of the first superframe; a second extracting unit, configured to perform background noise characteristic parameter extraction on each frame of the superframes after the first superframe;
- a DTX decision unit, configured to perform a DTX decision on each frame of the superframes after the first superframe; and a third coding unit, configured to perform background noise coding for the superframes after the first superframe, according to the extracted background noise characteristic parameters of the current superframe, the background noise characteristic parameters of several superframes preceding the current superframe, and the final DTX decision result.
- a decoding apparatus comprising: a CNG parameter obtaining unit, configured to obtain CNG parameters for the first frame of the first superframe from the speech coded frames preceding that frame; and a first decoding unit, configured to perform background noise decoding on the first frame of the first superframe according to the CNG parameters, where the CNG parameters include: a target excitation gain determined from the long-term-smoothed fixed codebook gains of the quantized speech coded frames;
- the embodiment of the invention has the following advantages:
- in the embodiments of the present invention, the background noise characteristic parameters during the trailing time are extracted; for the first superframe after the trailing time, background noise coding is performed according to the extracted background noise characteristic parameters and the background noise characteristic parameters of the first superframe; for the superframes after the first superframe, background noise characteristic parameter extraction and a DTX decision are performed for each frame, and background noise coding is performed according to the extracted background noise characteristic parameters of the current superframe, the background noise characteristic parameters of several superframes preceding the current superframe, and the final DTX decision result. This achieves:
- the communication bandwidth of the signal is significantly reduced while the coding quality is ensured.
- FIG. 1 shows a block diagram of a simple silent compression system
- Figure 2 shows the functional block diagram of the G.729.1 encoder
- Figure 3 shows the G.729.1 decoder system block diagram
- FIG. 5 is a schematic flow chart of encoding the first superframe
- FIG. 6 is a flow chart of narrowband partial parameter extraction and DTX decision
- FIG. 7 is a flowchart of background noise parameter extraction and the DTX decision for the narrowband part of the current superframe
- Figure 8 is a flow chart showing a first embodiment of the decoding method of the present invention
- Figure 9 is a block diagram showing a first embodiment of the encoding apparatus of the present invention
- Figure 10 is a block diagram showing a first embodiment of the decoding apparatus of the present invention.
- the synthesis filter parameters are mainly line spectrum frequency LSF quantization parameters
- the excitation signal parameters include: pitch delay parameters, pitch gain parameters, fixed codebook parameters, and fixed codebook gain parameters.
- the number of quantization bits and the quantization form of these parameters differ between coders; within one encoder supporting multiple rates, the number of quantization bits and the quantization form of the coding parameters also differ from rate to rate, because different rates emphasize different signal characteristics.
- the background noise coding parameters describe the background noise characteristics. Since the excitation signal of the background noise can be regarded as a simple random sequence of noise, these sequences can be simply generated by the random noise generation module at the codec end.
- the excitation signal characteristic parameters can be simply represented by energy parameters, without describing other characteristic parameters; therefore, in background noise coding, the excitation parameter is the energy parameter of the current background noise frame, which differs from a speech frame. As with speech frames, the synthesis filter parameter in the background noise coded stream is also the line spectrum frequency (LSF) quantization parameter, but the specific quantization methods differ.
- the silence compression scheme of G.729B is an early silence compression technology.
- the algorithm model on which its background noise codec technology is based is CELP, so the background noise parameters it transmits are also extracted based on the CELP model to describe the background noise,
- namely the synthesis filter parameters and excitation parameters, where the excitation parameters are energy parameters describing the background noise energy (the adaptive and fixed codebook parameters of speech excitation are not used), and the filter parameters are basically consistent with the speech coding parameters, i.e. LSF parameters.
- the encoder sends the signal to the DTX module, and the background noise parameter is extracted in the DTX module.
- the background noise is encoded as follows: if the filter parameters and energy parameters extracted from the current frame differ greatly from those of the previous frames, that is, the current background noise characteristics differ considerably from the previous background noise characteristics, the noise encoding module encodes the background noise parameters extracted from the current frame and assembles them into an SID (Silence Insertion Descriptor) frame sent to the decoding end; otherwise a NODATA frame (no data) is sent to the decoding end. SID frames and NODATA frames are called non-speech frames. At the decoding end, when the background noise phase is entered, comfort noise describing the background noise characteristics of the encoding end is synthesized in the CNG module according to the received non-speech frames.
- when G.729B processes a signal, it is processed frame by frame, with a frame length of 10 ms.
- the G.729B DTX, noise coding, and CNG modules are described in the three sections below.
- the DTX module is mainly used to estimate and quantize the background noise parameters and send SID frames.
- the DTX module needs to send background noise information to the decoding end; this information is encapsulated and sent in SID frames. If the current background noise is not stationary, an SID frame is sent; otherwise no SID frame is sent, and a NODATA frame carrying no data is sent instead.
- in addition, the interval between two adjacent SID frames is limited to a minimum of two frames; if the background noise is unstable and SID frames would have to be sent back to back, the transmission of the later SID frame is delayed.
- the DTX module receives the VAD module's output, autocorrelation coefficients, and past excitation samples from the encoder.
- the DTX module uses three values 0, 1, and 2 to describe the non-transmitted frames, respectively.
- the content of the background noise estimation is the energy level and the spectral envelope of the background noise, consistent with the speech coding parameters; therefore the computation of the spectral envelope is basically the same as the computation of the speech coding parameters. The parameters used include those of the previous two frames, and the energy parameter is likewise an average of the energies of the preceding frames.
- the Levinson-Durbin algorithm yields the residual energy, which is used as a simple estimate of the frame excitation energy.
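The residual-energy estimate mentioned above falls out of the Levinson-Durbin recursion itself. The sketch below is illustrative only (floating point, invented toy autocorrelation values), not the fixed-point implementation of G.729B:

```python
def levinson_durbin(r, order):
    """Levinson-Durbin recursion: returns LPC coefficients a[0..order]
    (a[0] = 1) and the final prediction-residual energy, given the
    autocorrelation values r[0..order]."""
    a = [0.0] * (order + 1)
    a[0] = 1.0
    err = r[0]                      # residual energy starts at the signal energy
    for i in range(1, order + 1):
        acc = r[i]
        for j in range(1, i):
            acc += a[j] * r[i - j]
        k = -acc / err              # reflection coefficient of order i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)        # residual energy shrinks at each order
    return a, err

r = [1.0, 0.5, 0.25, 0.125]         # toy autocorrelation of an AR(1)-like signal
lpc, resid = levinson_durbin(r, 3)
```

For this toy AR(1)-style autocorrelation the recursion recovers a single predictor tap of 0.5 and a residual energy of 0.75, which is the "simple estimate of the frame excitation energy" the text refers to.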
- the frame type of the current frame is estimated as follows: the algorithm compares the parameters of the previous SID frame with the corresponding current parameters; if the current filter differs significantly from the previous filter, or the current excitation energy differs significantly from the previous excitation energy, the flag flag_change is set to 1, otherwise the value of the flag is unchanged.
- a counter count_fr records the number of frames between the current frame and the previous SID frame. If this value is greater than N_min, an SID frame is sent; in addition, if flag_change equals 1, an SID frame is also sent. In all other cases the current frame is not sent. The comparisons are made on the filter, via the autocorrelation coefficients R_t and the previous SID filter coefficients a_sid, and on the frame energy.
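The decision logic can be sketched as follows. The distance measures and thresholds here are hypothetical placeholders (the actual G.729B comparison uses an Itakura-style spectral distance with fixed thresholds); only the flag_change / count_fr / N_min control flow follows the description above:

```python
# Illustrative thresholds -- NOT the G.729B values.
N_MIN = 2         # frame-count limit that forces an SID update
THR_LSF = 0.1     # spectral-change threshold (placeholder measure)
THR_E = 2.0       # energy-change threshold in dB (placeholder)

def dtx_decision(cur_lsf, cur_energy_db, sid_lsf, sid_energy_db, count_fr):
    """Return (send_sid, flag_change): an SID frame is sent when the noise
    character has drifted, or when count_fr exceeds N_MIN."""
    spectral_drift = max(abs(c - s) for c, s in zip(cur_lsf, sid_lsf))
    flag_change = int(spectral_drift > THR_LSF or
                      abs(cur_energy_db - sid_energy_db) > THR_E)
    send_sid = flag_change == 1 or count_fr > N_MIN
    return send_sid, flag_change
```

With stable noise and a small frame gap no SID frame is produced; either a spectral/energy drift or an elapsed interval triggers one.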
- the parameters in the SID frame are the LPC filter coefficients (spectral envelope) and the quantization parameters of the energy.
- the stability between adjacent noise frames is considered in the calculation of the SID LPC filter: first, the average LPC filter of the frames preceding the current SID frame is calculated from the averaged autocorrelation function, which is passed to the Levinson-Durbin algorithm to obtain the averaged filter coefficients.
- the algorithm then compares these average LPC filter coefficients of the preceding frames with the current LPC filter coefficients A_t; if the difference between the two is small, the average of the preceding frames is selected when the LPC coefficients are quantized, otherwise the current frame's A_t is selected. After selecting the LPC filter coefficients, the algorithm converts them into the LSF domain and then performs quantization coding; the quantization coding method is the same as that used in speech coding.
- the quantization of the energy parameters is done in the logarithmic domain, using linear quantization, and then encoding with 5 bits.
- the encoding of the background noise is completed, and then the coding bits are encapsulated in the SID frame.
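The log-domain 5-bit linear quantization of the energy parameter can be sketched as below. The level range and step size are illustrative assumptions, not the values of the G.729B codebook:

```python
import math

E_MIN_DB, E_MAX_DB = -70.0, 10.0    # illustrative dB range (assumption)
LEVELS = 32                          # 5 bits -> 32 linear levels

def quantize_energy(energy):
    """Linear quantization of a frame energy in the log (dB) domain,
    producing a 5-bit index."""
    e_db = 10.0 * math.log10(max(energy, 1e-12))
    step = (E_MAX_DB - E_MIN_DB) / (LEVELS - 1)
    idx = round((e_db - E_MIN_DB) / step)
    return max(0, min(LEVELS - 1, idx))          # clamp to 5 bits

def dequantize_energy(idx):
    """Reconstructed energy level in dB for a 5-bit index."""
    step = (E_MAX_DB - E_MIN_DB) / (LEVELS - 1)
    return E_MIN_DB + idx * step
```

Linear steps in the dB domain give a roughly constant relative error, which suits a level parameter whose audible differences are logarithmic.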
- Table A
- the parameters in the SID frame consist of four codebook indices, one for indicating the energy quantization index (5 bits) and the other three for indexing the spectral quantization (10 bits).
- the algorithm uses a level-controllable pseudo white noise to excite an interpolated LPC synthesis filter to obtain comfortable background noise, which is essentially the same as speech synthesis.
- the excitation level and the LPC filter coefficient are respectively obtained from the previous SID frame.
- the LPC filter coefficients of the subframe are obtained by interpolation of the LSP parameters in the SID frame, and the interpolation method is consistent with the interpolation method in the speech coder.
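The per-subframe interpolation can be sketched as a linear mix of LSP vectors. The 0.5 / 1.0 subframe weights below follow the common G.729-style scheme but are shown here as an assumption, not a quotation of the standard:

```python
def interpolate_lsp(prev_lsp, cur_lsp, w):
    """Linearly interpolate between the previous and current LSP vectors;
    w is the weight given to the current frame's LSPs."""
    return [(1.0 - w) * p + w * c for p, c in zip(prev_lsp, cur_lsp)]

prev, cur = [0.2, 0.4, 0.6], [0.3, 0.5, 0.7]   # toy LSP vectors
sub1 = interpolate_lsp(prev, cur, 0.5)         # first subframe: halfway mix
sub2 = interpolate_lsp(prev, cur, 1.0)         # second subframe: current LSPs
```

Interpolating in the LSP domain (rather than on raw LPC coefficients) keeps the intermediate filters stable, which is why speech coders interpolate there.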
- the pseudo white noise excitation ex(n) is a mixture of the speech excitation exl(n) and the Gaussian white noise excitation ex2(n).
- the gain of exl(n) is small, and the purpose of exl(n) is to make the transition between speech and non-speech more natural.
- the 80 sample points are divided into two sub-frames.
- the excitation signal ex(n) of the CNG module is synthesized in the following manner:
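The mixing idea can be sketched as a weighted sum of a low-gain speech-like excitation and Gaussian noise, scaled to the transmitted level. The mixing weight alpha and the RMS normalization are illustrative assumptions, not the G.729B formulas:

```python
import random

def cng_excitation(ex1, target_gain, alpha=0.6, seed=0):
    """Mix a speech-like excitation ex1 (kept at low weight) with Gaussian
    white noise ex2, then scale the mixture to the target excitation level.
    alpha is an illustrative mixing weight, not the standard's value."""
    rng = random.Random(seed)
    ex2 = [rng.gauss(0.0, 1.0) for _ in ex1]                 # white noise
    mixed = [(1.0 - alpha) * a + alpha * b for a, b in zip(ex1, ex2)]
    rms = (sum(x * x for x in mixed) / len(mixed)) ** 0.5
    if rms == 0.0:
        rms = 1.0
    return [target_gain * x / rms for x in mixed]

ex = cng_excitation([0.05] * 80, target_gain=0.5)  # 80 samples = one 10 ms frame at 8 kHz
```

Keeping a small speech-excitation component, as the text notes, smooths the transition between speech and comfort noise.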
- G.729.1 is the latest of the new generation of speech codec standards (see reference [1]); it is an extension of ITU-T G.729 to an 8-32 kbit/s scalable wideband (50-7000 Hz) coder.
- the sampling frequency of the encoder input and decoder output is 16000 Hz.
- the code stream generated by the encoder is scalable, and includes 12 embedded layers, which are called layers 1-12.
- the first layer is the core layer, and the corresponding bit rate is 8 kbit/s. This layer is consistent with the G.729 code stream, which makes G.729EV and G.729 interoperable.
- the second layer is a narrowband enhancement layer that adds 4 kbit/s (giving 12 kbit/s), while the third to twelfth layers are wideband enhancement layers, each adding 2 kbit/s up to 32 kbit/s.
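The layered rate structure just described can be summarized in a small sketch:

```python
def g7291_rate(layers):
    """Cumulative bit rate in kbit/s when the first `layers` embedded
    layers of a G.729.1 stream are decoded (1 <= layers <= 12)."""
    if not 1 <= layers <= 12:
        raise ValueError("G.729.1 defines 12 embedded layers")
    rate = 8                           # layer 1: core layer, G.729-compatible
    if layers >= 2:
        rate += 4                      # layer 2: narrowband enhancement
    rate += 2 * max(0, layers - 2)     # layers 3-12: wideband, +2 kbit/s each
    return rate
```

So a receiver that decodes only the core layer runs at 8 kbit/s, two layers give 12 kbit/s, three give 14 kbit/s, and the full stream reaches 32 kbit/s.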
- the G.729.1 codec is based on a three-stage architecture: an embedded Code-Excited Linear Prediction (CELP) codec, Time-Domain Bandwidth Extension (TDBWE), and a transform codec known as Time-Domain Aliasing Cancellation (TDAC).
- the embedded CELP stage produces Layers 1 and 2, producing 8 kbit/s and 12 kbit/s narrowband composite signals (50-4000 Hz).
- the TDBWE stage generates layer 3, producing a 14 kbit/s wideband output signal (50-7000 Hz).
- the TDAC stage works in the modified discrete cosine transform (MDCT) domain to generate layers 4-12, improving signal quality from 14 kbit/s up to 32 kbit/s.
- the TDAC codec jointly represents the weighted CELP coding error signal in the 50-4000 Hz band and the input signal in the 4000-7000 Hz band.
- the encoder operates in a 20 ms input superframe.
- the input signal s(n) is sampled at 16000 Hz, so an input superframe contains 320 samples.
- the input signal is divided into two sub-bands by QMF filtering (H1(z), H2(z)), and the low sub-band signal is preprocessed by a high-pass filter with a cutoff frequency of 50 Hz.
- the resulting low sub-band signal is encoded by the 8-12 kbit/s embedded CELP encoder.
- the difference between this signal and the local synthesis of the CELP encoder is the signal d_LB(n), which is passed through a perceptual weighting filter to obtain the signal d_LB^w(n), and d_LB^w(n) is transformed into the frequency domain by MDCT.
- the weighting filter W_LB(z) contains gain compensation to maintain spectral continuity between the filter output d_LB^w(n) and the high sub-band input signal.
- the high sub-band component is multiplied by (-1)^n to obtain the spectrally folded signal s_HB(n); s_HB(n) is preprocessed by a low-pass filter with a cutoff frequency of 3000 Hz, and the filtered signal is encoded by the TDBWE encoder.
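The (-1)^n operation is simple to illustrate: modulating a sequence by (-1)^n shifts its spectrum by half the sampling rate, so high-band content appears at low frequencies, and applying it twice restores the original signal:

```python
def spectral_fold(x):
    """Multiply the signal by (-1)^n, shifting its spectrum by half the
    sampling rate (high-band content maps down to low frequencies)."""
    return [s if n % 2 == 0 else -s for n, s in enumerate(x)]

nyquist_tone = [1.0, -1.0, 1.0, -1.0]   # cosine at fs/2 (highest frequency)
folded = spectral_fold(nyquist_tone)    # becomes a constant (DC) sequence
```

After folding, the high sub-band can be treated with ordinary low-pass tools, which is why the 3000 Hz low-pass filter is applied to s_HB(n) rather than to the original high band.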
- the filtered high sub-band signal is also transformed into the frequency domain by MDCT.
- the two sets of MDCT coefficients are finally encoded by the TDAC encoder.
- some parameters are additionally transmitted by an FEC (frame erasure concealment) encoder to mitigate the errors caused by frame loss during transmission.
- the block diagram of the decoder system is shown in Figure 3.
- the actual operating mode of the decoder is determined by the number of layers received, which is equivalent to the received bit rate.
- the code stream of the first layer or the first two layers is decoded by the embedded CELP decoder.
- the output signal is generated by a QMF synthesis filter bank, wherein the high-band synthesis signal is set to zero.
- in addition to the narrowband component decoded by the CELP decoder, the TDBWE decoder also decodes the high-band signal component; in the MDCT domain, the high sub-band corresponds to components above 3000 Hz (at the 16 kHz sampling rate).
- the low-band signal is processed via the perceptual weighting filter.
- pre-echo and post-echo detection and suppression are performed on the low-band and high-band signals.
- the low-band signal is post-filtered, and the high-band synthesis signal is processed by (-1)^n spectral folding.
- G.729.1 also defines requirements for a silence compression system: during background noise, the noise must be encoded and transmitted in a low-rate coding mode without degrading the overall coding quality of the signal.
- beyond this demand for DTX and CNG, more importantly, its DTX/CNG system is required to be compatible with G.729B.
- although G.729B's DTX/CNG system could in principle be ported to G.729.1, two problems must be solved: first, the processing frame lengths of the two encoders are different, so direct migration causes problems; second, the G.729B DTX/CNG system is somewhat simple, especially its parameter extraction part.
- therefore, the G.729B DTX/CNG system needs to be extended.
- the signal bandwidth processed by G.729.1 is wideband, while the bandwidth processed by G.729B is narrowband;
- therefore the high-band portion (4000 Hz to 7000 Hz) of the background noise signal must also be added to make a complete system.
- the high and low bands of background noise can be processed separately.
- the processing method of the high frequency band is relatively simple, and the coding mode of the background noise characteristic parameter can refer to the TDBWE coding mode of the speech encoder, and the decision part can simply compare the stability of the frequency domain envelope and the time domain envelope.
- the technical solution of the present invention, and the problem it solves, concern the low frequency band, that is, the narrowband part.
- the G.729.1 DTX/CNG system referred to below refers to the related processing applied to the narrowband DTX/CNG part.
- Step 401: extract background noise characteristic parameters during the trailing time;
- Step 402: for the first superframe after the trailing time, perform background noise coding according to the extracted background noise characteristic parameters of the trailing time and the background noise characteristic parameters of the first superframe, obtaining the first SID frame;
- Step 403: for the superframes after the first superframe, perform background noise characteristic parameter extraction and a DTX decision for each frame;
- Step 404: for the superframes after the first superframe, perform background noise coding according to the extracted background noise characteristic parameters of the current superframe, the background noise characteristic parameters of several superframes preceding the current superframe, and the final DTX decision result.
- in this way, the background noise characteristic parameters during the trailing time are extracted; for the first superframe after the trailing time, background noise coding is performed according to the extracted background noise characteristic parameters and the background noise characteristic parameters of the first superframe; for the superframes after the first superframe, background noise characteristic parameter extraction and a DTX decision are performed for each frame, and background noise coding is performed according to the extracted background noise characteristic parameters of the current superframe, the background noise characteristic parameters of several superframes preceding the current superframe, and the final DTX decision result. This achieves:
- the communication bandwidth of the signal is significantly reduced while the coding quality is ensured.
- in order to meet the requirements of the related G.729.1 technical standard, each superframe may be set to 20 milliseconds, and each frame included in a superframe to 10 milliseconds.
- the extension to G.729B can be achieved to meet the technical specifications of G.729.1.
- the technical solutions provided by the various embodiments of the present invention can also achieve high communication quality with low bandwidth occupation for background noise in other systems; that is, the scope of application of the present invention is not limited to the G.729.1 system.
- the present invention mainly describes the DTX/CNG system of G.729.1 with respect to this difference, that is, the G.729B DTX/CNG system is upgraded and extended to adapt to the system characteristics of G.729.1.
- after speech ends, background noise coding is not entered immediately; the first 120 ms of background noise is still encoded at the speech coding rate.
- this trailing (hangover) time is generally 6 superframes, i.e. 120 ms (as in AMR and AMR-WB).
- the duration of this noise learning can be set according to actual needs and is not limited to 120 ms; the trailing time can likewise be set to other values as needed.
- as shown in FIG. 5, a schematic flowchart of coding the first superframe, the steps are as follows: for the first superframe after the end of the trailing phase, background noise coding is performed using the background noise characteristic parameters extracted during the noise learning phase together with those of the current superframe,
- obtaining the first SID superframe. Since the first superframe after the trailing phase is the first to be encoded and transmitted with background noise parameters, it is generally referred to as the first SID superframe; it is decoded after being sent to the decoder. Since one superframe corresponds to two 10 ms frames, in order to obtain accurate coding parameters, the background noise characteristic parameters A_t and E_t are extracted in the second 10 ms frame:
- Step 501 Calculate an average of all autocorrelation coefficients in the cache:
- Step 502: the estimated residual energy E_t is smoothed over the long term: E_LT = a·E_LT + (1 − a)·E_t, where the value range of a is 0 < a < 1; as a preferred embodiment, a may be 0.9. It can also be set to other values as needed.
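The long-term smoothing above is a one-line recursion; here it is sketched with a = 0.9 as in the preferred embodiment (toy energy values):

```python
def smooth_energy(e_lt, e_t, a=0.9):
    """One update of the long-term smoother E_LT = a*E_LT + (1 - a)*E_t."""
    return a * e_lt + (1.0 - a) * e_t

# a few updates toward a new, lower noise level
e_lt = 1.0
for e_t in (0.5, 0.5, 0.5):
    e_lt = smooth_energy(e_lt, e_t)
```

With a close to 1, the estimate tracks the noise floor slowly, so isolated outlier frames barely disturb the transmitted level.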
- Step 503: the algorithm converts the LPC filter coefficients A_t into the LSF domain, and then performs quantization coding;
- Step 504: the quantization of the residual energy parameter E_t is performed in the logarithmic domain, using linear quantization. After the encoding of the narrowband portion of the background noise is completed, the encoded bits are encapsulated in the SID frame and transmitted to the decoding end, thus completing the encoding of the narrowband portion of the first SID frame.
- the encoding of the narrowband portion of the first SID frame fully considers the characteristics of the background noise during the trailing phase and reflects them in the encoding parameters, so that these parameters capture the characteristics of the current background noise as fully as possible. Therefore, parameter extraction in the embodiment of the present invention is more accurate and reasonable than in G.729B.
- as shown in FIG. 6, a flowchart of narrowband parameter extraction and the DTX decision, the steps include: first, performing background noise parameter extraction and the DTX decision for the first 10 ms frame of the superframes after the first superframe;
- Step 601: according to the autocorrelation coefficients of the nearest four adjacent 10 ms frames, calculate the steady-state average R'_t(j) of the current autocorrelation coefficients:
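The steady-state average over the four most recent 10 ms frames can be sketched as follows (the autocorrelation values are toy numbers for illustration):

```python
def steady_autocorr(history):
    """Average the autocorrelation vectors of the four most recent
    10 ms frames to obtain the steady-state estimate R'(j)."""
    last4 = history[-4:]
    order = len(last4[0])
    return [sum(f[j] for f in last4) / len(last4) for j in range(order)]

# six frames of toy autocorrelation vectors; only the last four matter
history = [[1.0, 0.5], [1.0, 0.5], [3.0, 1.5], [3.0, 1.5], [1.0, 0.5], [1.0, 0.5]]
r_avg = steady_autocorr(history)
```

Averaging over a short window suppresses frame-to-frame estimation noise while still tracking slow changes in the background noise spectrum.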
- Step 602: in order to obtain a more stable estimate, the algorithm smooths the estimated frame energy over the long term: E_LT = a·E_LT + (1 − a)·E_t1
- Step 603 After the parameter is extracted, perform a DTX decision of the current 10 ms frame; the specific content of the DTX decision is:
- the algorithm compares the parameters of the previous SID superframe (an SID superframe is a background noise superframe that is actually encoded and sent after the DTX decision; if the DTX decision result is that the superframe is not sent, it is not called an SID superframe)
- with the corresponding coding parameters of the current 10 ms frame; if the current LPC filter coefficients differ significantly from those of the previous SID superframe, or the current energy parameter differs significantly from the energy parameter of the previous SID superframe, the parameter change flag flag_change_first of the current 10 ms frame is set to 1, otherwise it is cleared.
- the specific determination method in this step is similar to that of G.729B: flag_change_first is set to 1 when the spectral distance between the current filter, computed from the autocorrelation coefficients R_t, and the previous SID filter coefficients a_sid exceeds a threshold, and to 0 otherwise. Next, the average of the residual energies of four 10 ms frames (the current 10 ms frame and the preceding three 10 ms frames) is calculated:
- the threshold on the difference between the two excitation energies can be set to other values according to actual needs, which does not exceed the protection scope of the present invention.
- the background noise parameter extraction and DTX decision of the second 10 ms frame are performed.
- the background noise parameter extraction and DTX decision flow for the second 10 ms frame is the same as for the first 10 ms frame, where the relevant parameters of the second 10 ms frame are: the steady-state average R'_t2(j) of the autocorrelation coefficients of the adjacent four 10 ms frames, the average E_t2 of the adjacent 10 ms frame energies, and the DTX flag of the second 10 ms frame, flag_change_second.
- next, the background noise parameter extraction and DTX decision of the narrowband part of the current superframe are performed.
- as shown in FIG. 7, a flowchart of narrowband background noise parameter extraction and the DTX decision in the current superframe, the steps include:
- Step 701: determine the final DTX flag flag_change of the narrowband portion of the current superframe as follows: flag_change = flag_change_first || flag_change_second,
- i.e., if the flag of either 10 ms frame is 1, the final decision result of the narrowband portion of the current superframe is 1.
- Step 702: determine the final DTX decision result of the current superframe. To obtain the final DTX decision result of the whole current superframe, the characteristics of the high-band portion are also considered; the narrowband portion and the high-band portion jointly determine the final DTX decision result of the current superframe. If the final DTX decision result of the current superframe is 1, proceed to step 703; if it is 0, no encoding is performed, and only a NODATA frame containing no data is sent to the decoding end.
- Step 703: if the final DTX decision result of the current superframe is 1, extract the background noise characteristic parameters of the current superframe. The source of these parameters is the parameters of the current two 10 ms frames, which are smoothed to obtain the background noise coding parameters of the current superframe.
- the process includes: first, computing the weighted average of the parameters of the two 10ms frames; for the frame energy:
- E = smooth_rate * E_t1 + (1 - smooth_rate) * E_t2
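The two-frame smoothing above can be sketched as follows. This is a minimal illustration with hypothetical names (r1, r2, e1, e2); the bit-exact procedure is defined by the codec, and smooth_rate = 0.5 is only an example value:

```python
def smooth_superframe_params(r1, r2, e1, e2, smooth_rate=0.5):
    """Blend the two 10ms frames' steady-state autocorrelation averages
    (r1, r2) and frame energies (e1, e2) into one set of superframe
    background-noise parameters:
        R(j) = smooth_rate * R_t1(j) + (1 - smooth_rate) * R_t2(j)
        E    = smooth_rate * E_t1    + (1 - smooth_rate) * E_t2
    """
    r = [smooth_rate * a + (1.0 - smooth_rate) * b for a, b in zip(r1, r2)]
    e = smooth_rate * e1 + (1.0 - smooth_rate) * e2
    return r, e
```

With smooth_rate = 0.5 this reduces to a plain average of the two frames' parameters.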
- the background noise feature parameter extraction and DTX control fully rely on the characteristics of each 10ms frame of the current superframe, so the algorithm is more rigorous. 5. Encoding of the SID frame:
- the encoding of the SID frame is largely the same as in G.729B, except that when the spectral parameters of the SID frame are finally encoded, the stability of the adjacent noise frames is taken into account. The specific operation differs from G.729B as follows:
- the algorithm calculates the average LPC filter coefficients of the previous few superframes and compares them with the LPC filter coefficients of the current superframe. If the difference between the two is small, the average over the previous few superframes is selected when the LPC coefficients of the current superframe are quantized; otherwise, the current superframe's own coefficients are used.
- the specific comparison method is the same as the DTX decision for the 10ms frame in step 602, where the threshold thr3 is generally between 1.0 and 1.5, and is 1.0966466 in this embodiment. Those skilled in the art can take other values according to actual needs without exceeding the protection scope of the present invention.
- after selecting the LPC filter coefficients, the algorithm converts them into the LSF domain and then performs quantization coding; the quantization coding is similar to the G.729B quantization coding method.
- the quantization of the energy parameter is done in the logarithmic domain, using linear quantization followed by encoding. The background noise is thus encoded, and the encoded bits are encapsulated in the SID frame. Sixth, the CNG mode:
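Linear quantization of an energy parameter in the logarithmic domain can be sketched as below. The step size and number of levels are illustrative assumptions, not the G.729B values:

```python
import math

def quantize_energy_db(energy, step_db=2.0, num_levels=32):
    """Linearly quantize a positive residual energy in the log domain:
    convert to dB, round to a uniform grid, and clamp to the codebook
    range. Returns the transmitted index and the decoder-side value."""
    e_db = 10.0 * math.log10(max(energy, 1e-12))  # guard against log(0)
    index = int(round(e_db / step_db))
    index = max(0, min(num_levels - 1, index))    # clamp to valid indices
    decoded_db = index * step_db                  # decoder-side reconstruction
    return index, decoded_db
```

Because the grid is uniform in dB, quantization error is bounded by half a step in the logarithmic domain, i.e. relative (not absolute) energy error is controlled.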
- the decoding process is also included at the encoding end, and the CNG system is no exception; that is, in G.729.1 the encoding end also includes CNG.
- the processing flow is based on G.729B. Although the frame length of G.729.1 is 20ms, the background noise is processed with a data length of 10ms.
- the encoding parameters of the first SID superframe are not encoded until its second 10ms frame, but the system needs to generate CNG parameters already in the first 10ms frame of the first SID superframe.
- the CNG parameters of the first 10ms frame of the first SID superframe therefore cannot be obtained from the coding parameters of the SID superframe, but only from the previous speech coding superframe. Because of this special case, the CNG mode of the first 10ms frame of the first SID superframe in G.729.1 differs from the CNG mode of G.729B introduced above. The differences are:
- the target excitation gain is defined by the fixed codebook gain quantized in the speech coding superframes and smoothed over a long time; the LPC filter is defined analogously:
- LT_A(z) = β * LT_A(z) + (1 - β) * A_q(z)
- the smoothing factor β has a value range of 0 to 1, and is 0.5 in this embodiment.
- the CNG mode of all other 10ms frames is consistent with G.729B.
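The long-term smoothing that seeds CNG from the preceding speech superframes can be sketched as a simple exponential average of the quantized LPC coefficients. The helper name is hypothetical; β = 0.5 follows this embodiment:

```python
def update_lt_lpc(lt_a, a_q, beta=0.5):
    """Long-term smoothing of quantized LPC coefficients across speech
    superframes: LT_A <- beta * LT_A + (1 - beta) * A_q.
    The smoothed filter LT_A seeds CNG in the first 10ms frame of the
    first SID superframe, before any SID parameters have arrived."""
    return [beta * lt + (1.0 - beta) * aq for lt, aq in zip(lt_a, a_q)]
```

The state LT_A would be updated once per speech superframe, so at the speech-to-noise transition it already holds a recency-weighted average of the last quantized filters.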
- the trailing time is 120 milliseconds or 140 milliseconds.
- the extraction of the background noise characteristic parameters during the trailing time is specifically: during the trailing time, the autocorrelation coefficients of the background noise are saved for each frame of each superframe.
- for the first superframe after the trailing time, performing background noise coding according to the background noise characteristic parameters extracted during the trailing time and the background noise characteristic parameters of the first superframe includes:
- the extracting of the LPC filter coefficients is specifically: calculating the average of the autocorrelation coefficients of the four superframes of the trailing time preceding the first superframe and of the first superframe itself;
- the extracting the residual energy A is specifically:
- the residual energy is linearly quantized in the log domain.
- the background noise characteristic parameters are extracted for each frame of the superframes after the first superframe in the above embodiment.
- the background noise LPC filter coefficients and residual energy are calculated according to the Levinson-Durbin algorithm.
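The Levinson-Durbin recursion referenced here is the standard one: it converts autocorrelation values into LPC coefficients and a residual (prediction error) energy. Below is a textbook floating-point sketch, not a bit-exact transcription of the codec's fixed-point routine:

```python
def levinson_durbin(r, order):
    """From autocorrelation values r[0..order], compute LPC coefficients
    a[1..order] (prediction x[n] ~ sum_j a[j] * x[n-j]) and the residual
    prediction-error energy, via the Levinson-Durbin recursion."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)               # residual energy shrinks each order
    return a[1:], err
```

For an order-1 model with r = [1.0, 0.5] this yields the coefficient 0.5 and residual energy 0.75, matching the closed-form AR(1) solution.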
- the method further includes:
- the smoothing mode is:
- E_LT = a * E_LT + (1 - a) * E_tk
- the smoothed current frame energy estimate is assigned to the residual energy; the assignment method is: E_tk = E_LT.
- the parameter change flag of the current 10-millisecond frame is initially set to zero.
- to judge whether the energy estimate of the current frame differs significantly from the energy estimate in the previous SID superframe, the average of the residual energies of the current 10-millisecond frame and the previous 3 frames (4 frames in total) is calculated as the energy estimate of the current frame;
- the performing of the DTX decision for each frame is specifically: if the DTX decision result of any frame in the current superframe is 1, the DTX decision result of the narrowband portion of the current superframe is 1.
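A rough sketch of the per-frame flag and the superframe-level OR described above. The energy threshold thr_db and both function names are illustrative placeholders, not values or names from the specification:

```python
def frame_dtx_flag(curr_energy_db, last_sid_energy_db, thr_db=3.0):
    """Per-frame parameter-change test (energy part only, sketched):
    flag the frame if its log-domain energy estimate differs from the
    previous SID superframe's log energy by more than a threshold."""
    return 1 if abs(curr_energy_db - last_sid_energy_db) > thr_db else 0

def narrowband_dtx_decision(frame_flags):
    """If any 10ms frame in the superframe is flagged, the narrowband
    DTX decision for the whole superframe is 1 (a SID update is due)."""
    return 1 if any(frame_flags) else 0
```

The OR aggregation means a single changed 10ms frame is enough to trigger transmission of an updated SID frame for the superframe.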
- if the final DTX decision result of the current superframe is 1, then the process of "for the superframe after the first superframe, performing background noise coding according to the extracted background noise characteristic parameters of the current superframe, the background noise characteristic parameters of several superframes before the current superframe, and the final DTX decision result" includes:
- determining a smoothing factor including:
- the smoothing factor is 0.1, otherwise the smoothing factor is 0.5;
- performing parameter smoothing on the two frames of the current superframe, and using the smoothed parameters as the characteristic parameters for background noise coding of the current superframe, where the parameter smoothing includes:
- R_t(j) = smooth_rate * R_t1(j) + (1 - smooth_rate) * R_t2(j), where smooth_rate is the smoothing factor, R_t1(j) is the steady-state average of the autocorrelation coefficients of the first frame, and R_t2(j) is the steady-state average of the autocorrelation coefficients of the second frame;
- the LPC filter coefficients are obtained according to the Levinson-Durbin algorithm.
- the process of "performing background noise coding according to the extracted background noise characteristic parameters of the current superframe, the background noise characteristic parameters of several superframes before the current superframe, and the final DTX decision result" is specifically: calculating the average of the autocorrelation coefficients of the several superframes before the current superframe;
- if the difference between the average LPC filter coefficients and the LPC filter coefficients of the current superframe is less than or equal to a preset value, the average LPC filter coefficients are converted into the LSF domain for quantization coding; if the difference is greater than the preset value, the LPC filter coefficients of the current superframe are converted into the LSF domain for quantization coding. For the energy parameter, linear quantization coding is performed in the logarithmic domain.
- the number of the several frames is 5. Those skilled in the art can also select other numbers of frames as needed.
- before the step of extracting the background noise characteristic parameters during the trailing time, the method further includes:
- the background noise during the trailing time is encoded with a speech coding rate.
- FIG. 8 shows a first embodiment of the decoding method of the present invention, including the steps:
- Step 801: Obtain the CNG parameters of the first frame of the first superframe from the speech coded frame preceding the first frame of the first superframe.
- Step 802: Perform background noise decoding on the first frame of the first superframe according to the CNG parameters, where the CNG parameters include:
- the target excitation gain, determined by the fixed codebook gain quantized in the speech coded frames and smoothed over a long time;
- the long-term smoothing factor takes a value ranging from greater than 0 to less than 1.
- the long-term smoothing factor may be 0.5.
- the above 0.4.
- the first embodiment of the encoding apparatus of the present invention includes: a first extracting unit 901, configured to extract background noise characteristic parameters during a trailing time; and a second encoding unit 902, configured to, for the first superframe after the trailing time, perform background noise encoding according to the extracted background noise characteristic parameters of the trailing time and the background noise characteristic parameters of the first superframe;
- a second extracting unit 903, configured to perform background noise characteristic parameter extraction on each frame of the superframes after the first superframe;
- a DTX decision unit 904, configured to perform a DTX decision on each frame of the superframes after the first superframe;
- a third encoding unit 905, configured to, for a superframe after the first superframe, perform background noise encoding according to the extracted background noise characteristic parameters of the current superframe, the background noise characteristic parameters of the several superframes before the current superframe, and the final DTX decision result.
- the trailing time is 120 milliseconds or 140 milliseconds.
- the first extracting unit specifically includes: a cache module, configured to save, during the trailing time, the autocorrelation coefficients of the background noise for each frame of each superframe.
- the second coding unit specifically includes: an extraction module, configured to save the autocorrelation coefficients of the background noise of each of the first frame and the second frame; and an encoding module, configured to, in the second frame, extract the LPC filter coefficients and the residual energy of the first superframe according to the extracted autocorrelation coefficients of the two frames and the background noise characteristic parameters of the trailing time, and perform background noise coding.
- the second coding unit may further include: a residual energy smoothing module, configured to perform long-term smoothing on the residual energy;
- the second extraction unit specifically includes:
- a first calculating module, configured to calculate the steady-state average of the current autocorrelation coefficients from the autocorrelation coefficients of the last four adjacent frames, where the steady-state average is the average of the autocorrelation coefficients of the two frames whose autocorrelation coefficient norms are intermediate among the last four adjacent frames;
- a second calculating module, configured to calculate the background noise LPC filter coefficients and residual energy from the steady-state average according to the Levinson-Durbin algorithm.
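The steady-state averaging rule described above (rank the last four frames by autocorrelation norm, discard the largest and smallest, average the middle two) can be sketched as:

```python
def steady_state_average(last4_r):
    """Steady-state autocorrelation average over the last four 10ms
    frames: sort the four autocorrelation vectors by Euclidean norm,
    drop the largest and smallest, and average the middle two
    element-wise. This trims outlier frames (e.g. brief transients)
    out of the background-noise estimate."""
    ranked = sorted(last4_r, key=lambda r: sum(x * x for x in r) ** 0.5)
    mid1, mid2 = ranked[1], ranked[2]
    return [(x + y) / 2.0 for x, y in zip(mid1, mid2)]
```

This is a median-like trimmed mean: a single anomalous frame cannot pull the estimate, which keeps the synthesized comfort noise spectrum stable.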
- the second extraction unit may further include:
- a second residual energy smoothing module, configured to perform long-term smoothing on the residual energy to obtain a current frame energy estimate, the smoothing manner being:
- E_LT = a * E_LT + (1 - a) * E_tk
- the smoothed current frame energy estimate is assigned to the residual energy; the assignment method is: E_tk = E_LT.
- the DTX decision unit specifically includes:
- a threshold comparison module, configured to generate a decision instruction if the distance between the LPC filter coefficients of the current frame and those of the previous SID superframe exceeds a preset threshold;
- an energy comparison module, configured to calculate the average of the residual energies of the current frame and the previous three frames (4 frames in total) as the energy estimate of the current frame, quantize this average with the energy quantizer, and generate a decision instruction if the difference between the decoded logarithmic energy and the logarithmic energy of the previous SID superframe is greater than a preset value; and a first determining module, configured to set the parameter change flag of the current frame to 1 according to the decision instruction.
- the foregoing embodiment may further include: a second determining unit, configured to set the DTX decision result of the narrowband portion of the current superframe to 1 if the DTX decision result of any frame in the current superframe is 1;
- the third coding unit specifically includes: a smoothing indication module, configured to generate a smoothing instruction if the final DTX decision result of the current superframe is 1; and a smoothing factor determining module, configured to determine the smoothing factor of the current superframe after receiving the smoothing instruction;
- a parameter smoothing module, configured to perform parameter smoothing on the two frames of the current superframe and use the smoothed parameters as the characteristic parameters for background noise encoding of the current superframe, including: calculating a weighted average of the steady-state averages of the autocorrelation coefficients of the two frames:
- R_t(j) = smooth_rate * R_t1(j) + (1 - smooth_rate) * R_t2(j), where smooth_rate is the smoothing factor, R_t1(j) is the steady-state average of the autocorrelation coefficients of the first frame, and R_t2(j) is the steady-state average of the autocorrelation coefficients of the second frame;
- the third coding unit specifically includes: a third calculating module, configured to calculate the average LPC filter coefficients of the several superframes before the current superframe according to the average of their autocorrelation coefficients; a module configured to, if the difference between the average LPC filter coefficients and the LPC filter coefficients of the current superframe is less than or equal to a preset value, convert the average LPC filter coefficients into the LSF domain for quantization coding, and, if the difference is greater than the preset value, convert the LPC filter coefficients of the current superframe into the LSF domain for quantization coding; and an encoding module, configured to perform linear quantization coding on the energy parameter in the logarithmic domain.
- a first coding unit, configured to encode the background noise during the trailing time at a speech coding rate; the encoding apparatus of the present invention corresponds to the encoding method of the present invention and, correspondingly, achieves the same technical effects as the corresponding method embodiments.
- FIG. 10 shows a first embodiment of the decoding apparatus of the present invention, including:
- the LPC filter coefficients are defined by the quantized LPC filter coefficients of the speech coded frames, smoothed over a long time; in practical use, the definition may specifically be: LPC filter coefficients = long-time smoothed quantized LPC filter coefficients of the speech coded frames.
- the long-term smoothing factor ranges from greater than 0 to less than 1; in a preferred case, it may be 0.5. The foregoing embodiment may further include:
- a second decoding unit, configured to, for all frames except those of the first superframe, acquire the CNG parameters from the previous SID superframe and perform background noise decoding according to the acquired CNG parameters.
- the 0.4.
Description
Claims
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP09726234.9A EP2224428B1 (en) | 2008-03-26 | 2009-03-26 | Coding methods and devices |
US12/820,805 US8370135B2 (en) | 2008-03-26 | 2010-06-22 | Method and apparatus for encoding and decoding |
US12/881,926 US7912712B2 (en) | 2008-03-26 | 2010-09-14 | Method and apparatus for encoding and decoding of background noise based on the extracted background noise characteristic parameters |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2008100840776A CN101335000B (zh) | 2008-03-26 | 2008-03-26 | Method and apparatus for encoding (zh) |
CN200810084077.6 | 2008-03-26 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/820,805 Continuation US8370135B2 (en) | 2008-03-26 | 2010-06-22 | Method and apparatus for encoding and decoding |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009117967A1 true WO2009117967A1 (zh) | 2009-10-01 |
Family
ID=40197557
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2009/071030 WO2009117967A1 (zh) | 2008-03-26 | 2009-03-26 | 编码、解码的方法及装置 |
Country Status (7)
Country | Link |
---|---|
US (2) | US8370135B2 (zh) |
EP (1) | EP2224428B1 (zh) |
KR (1) | KR101147878B1 (zh) |
CN (1) | CN101335000B (zh) |
BR (1) | BRPI0906521A2 (zh) |
RU (1) | RU2461898C2 (zh) |
WO (1) | WO2009117967A1 (zh) |
Families Citing this family (50)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4368575B2 (ja) | 2002-04-19 | 2009-11-18 | Panasonic Corporation | Variable-length decoding method, variable-length decoding device, and program |
KR101291193B1 (ko) | 2006-11-30 | 2013-07-31 | Samsung Electronics Co., Ltd. | Frame error concealment method |
CN101246688B (zh) * | 2007-02-14 | 2011-01-12 | Huawei Technologies Co., Ltd. | Method, system and apparatus for encoding and decoding a background noise signal |
JP2009063928A (ja) * | 2007-09-07 | 2009-03-26 | Fujitsu Ltd | Interpolation method and information processing apparatus |
DE102008009719A1 (de) * | 2008-02-19 | 2009-08-20 | Siemens Enterprise Communications Gmbh & Co. Kg | Method and means for encoding background noise information |
DE102008009720A1 (de) * | 2008-02-19 | 2009-08-20 | Siemens Enterprise Communications Gmbh & Co. Kg | Method and means for decoding background noise information |
CN101335000B (zh) * | 2008-03-26 | 2010-04-21 | Huawei Technologies Co., Ltd. | Method and apparatus for encoding |
US20100114568A1 (en) * | 2008-10-24 | 2010-05-06 | Lg Electronics Inc. | Apparatus for processing an audio signal and method thereof |
US8442837B2 (en) * | 2009-12-31 | 2013-05-14 | Motorola Mobility Llc | Embedded speech and audio coding using a switchable model core |
PT2559028E (pt) * | 2010-04-14 | 2015-11-18 | Voiceage Corp | Flexible and scalable combined innovation codebook for use in a CELP coder and decoder |
US20130268265A1 (en) * | 2010-07-01 | 2013-10-10 | Gyuhyeok Jeong | Method and device for processing audio signal |
CN101895373B (zh) * | 2010-07-21 | 2014-05-07 | Huawei Technologies Co., Ltd. | Channel decoding method, system and device |
EP2458586A1 (en) * | 2010-11-24 | 2012-05-30 | Koninklijke Philips Electronics N.V. | System and method for producing an audio signal |
JP5724338B2 (ja) * | 2010-12-03 | 2015-05-27 | Sony Corporation | Encoding device and encoding method, decoding device and decoding method, and program |
JP2013076871A (ja) * | 2011-09-30 | 2013-04-25 | Oki Electric Ind Co Ltd | Speech encoding device and program, speech decoding device and program, and speech encoding system |
KR102138320B1 (ko) | 2011-10-28 | 2020-08-11 | Electronics and Telecommunications Research Institute | Apparatus and method for a signal codec in a communication system |
CN103093756B (zh) * | 2011-11-01 | 2015-08-12 | Leadcore Technology Co., Ltd. | Comfort noise generation method and comfort noise generator |
CN103137133B (zh) * | 2011-11-29 | 2017-06-06 | Nanjing Zhongxing Software Co., Ltd. | Inactive sound signal parameter estimation method and comfort noise generation method and system |
US20130155924A1 (en) * | 2011-12-15 | 2013-06-20 | Tellabs Operations, Inc. | Coded-domain echo control |
CN103187065B (zh) | 2011-12-30 | 2015-12-16 | Huawei Technologies Co., Ltd. | Audio data processing method, apparatus and system |
US9065576B2 (en) | 2012-04-18 | 2015-06-23 | 2236008 Ontario Inc. | System, apparatus and method for transmitting continuous audio data |
CN104603874B (zh) * | 2012-08-31 | 2017-07-04 | 瑞典爱立信有限公司 | 用于语音活动性检测的方法和设备 |
MY185490A (en) | 2012-09-11 | 2021-05-19 | Ericsson Telefon Ab L M | Generation of comfort noise |
CA2948015C (en) | 2012-12-21 | 2018-03-20 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Comfort noise addition for modeling background noise at low bit-rates |
AU2013366642B2 (en) | 2012-12-21 | 2016-09-22 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Generation of a comfort noise with high spectro-temporal resolution in discontinuous transmission of audio signals |
ES2905846T3 (es) | 2013-01-29 | 2022-04-12 | Fraunhofer Ges Forschung | Apparatus and method for generating a frequency-enhanced signal by temporal smoothing of subbands |
PL3471093T3 (pl) * | 2013-01-29 | 2021-04-06 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Noise filling in perceptual transform audio coding |
EP3550562B1 (en) * | 2013-02-22 | 2020-10-28 | Telefonaktiebolaget LM Ericsson (publ) | Methods and apparatuses for dtx hangover in audio coding |
EP2981963B1 (en) | 2013-04-05 | 2017-01-04 | Dolby Laboratories Licensing Corporation | Companding apparatus and method to reduce quantization noise using advanced spectral extension |
CN106169297B (zh) | 2013-05-30 | 2019-04-19 | Huawei Technologies Co., Ltd. | Signal encoding method and device |
WO2014202539A1 (en) | 2013-06-21 | 2014-12-24 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for improved concealment of the adaptive codebook in acelp-like concealment employing improved pitch lag estimation |
PL3011555T3 (pl) * | 2013-06-21 | 2018-09-28 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Reconstruction of a speech frame |
US9418671B2 (en) * | 2013-08-15 | 2016-08-16 | Huawei Technologies Co., Ltd. | Adaptive high-pass post-filter |
EP3057377B1 (en) * | 2013-11-07 | 2021-12-29 | Huawei Technologies Co., Ltd. | Network device, terminal device and voice service control method |
SG11201606512TA (en) * | 2014-01-15 | 2016-09-29 | Samsung Electronics Co Ltd | Weight function determination device and method for quantizing linear prediction coding coefficient |
CN111312277B (zh) | 2014-03-03 | 2023-08-15 | Samsung Electronics Co., Ltd. | Method and device for high-frequency decoding for bandwidth extension |
US10157620B2 (en) * | 2014-03-04 | 2018-12-18 | Interactive Intelligence Group, Inc. | System and method to correct for packet loss in automatic speech recognition systems utilizing linear interpolation |
EP3913628A1 (en) * | 2014-03-24 | 2021-11-24 | Samsung Electronics Co., Ltd. | High-band encoding method |
JP6035270B2 (ja) * | 2014-03-24 | 2016-11-30 | NTT Docomo, Inc. | Speech decoding device, speech encoding device, speech decoding method, speech encoding method, speech decoding program, and speech encoding program |
CN104978970B (zh) * | 2014-04-08 | 2019-02-12 | Huawei Technologies Co., Ltd. | Noise signal processing and generation method, codec, and coding/decoding system |
US9572103B2 (en) * | 2014-09-24 | 2017-02-14 | Nuance Communications, Inc. | System and method for addressing discontinuous transmission in a network device |
CN105846948B (zh) * | 2015-01-13 | 2020-04-28 | ZTE Corporation | Method and device for implementing HARQ-ACK detection |
WO2016142002A1 (en) * | 2015-03-09 | 2016-09-15 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Audio encoder, audio decoder, method for encoding an audio signal and method for decoding an encoded audio signal |
CN106160944B (zh) * | 2016-07-07 | 2019-04-23 | 广州市恒力安全检测技术有限公司 | Variable-rate coding compression method for ultrasonic partial discharge signals |
US11670308B2 (en) | 2018-06-28 | 2023-06-06 | Telefonaktiebolaget Lm Ericsson (Publ) | Adaptive comfort noise parameter determination |
CN115132214A (zh) | 2018-06-29 | 2022-09-30 | Huawei Technologies Co., Ltd. | Stereo signal encoding and decoding methods, encoding apparatus and decoding apparatus |
CN109490848B (zh) * | 2018-11-07 | 2021-01-01 | 国科电雷(北京)电子装备技术有限公司 | Long and short radar pulse signal detection method based on two-stage channelization |
US10784988B2 (en) | 2018-12-21 | 2020-09-22 | Microsoft Technology Licensing, Llc | Conditional forward error correction for network data |
US10803876B2 (en) * | 2018-12-21 | 2020-10-13 | Microsoft Technology Licensing, Llc | Combined forward and backward extrapolation of lost network data |
CN112037803B (zh) * | 2020-05-08 | 2023-09-29 | 珠海市杰理科技股份有限公司 | Audio encoding method and apparatus, electronic device, and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0785541B1 (en) * | 1996-01-22 | 2003-04-16 | Rockwell International Corporation | Usage of voice activity detection for efficient coding of speech |
US6711537B1 (en) * | 1999-11-22 | 2004-03-23 | Zarlink Semiconductor Inc. | Comfort noise generation for open discontinuous transmission systems |
CN1513168A (zh) * | 2000-11-27 | 2004-07-14 | Nokia | Method and system for generating comfort noise in voice communication |
EP1288913B1 (en) * | 2001-08-31 | 2007-02-21 | Fujitsu Limited | Speech transcoding method and apparatus |
CN101335000A (zh) * | 2008-03-26 | 2008-12-31 | Huawei Technologies Co., Ltd. | Method and apparatus for encoding and decoding |
Family Cites Families (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2020899C (en) * | 1989-08-18 | 1995-09-05 | Nambirajan Seshadri | Generalized viterbi decoding algorithms |
JP2877375B2 (ja) * | 1989-09-14 | 1999-03-31 | Toshiba Corporation | Cell transfer scheme using a variable-rate codec |
JP2776094B2 (ja) * | 1991-10-31 | 1998-07-16 | NEC Corporation | Variable modulation communication method |
US5559832A (en) * | 1993-06-28 | 1996-09-24 | Motorola, Inc. | Method and apparatus for maintaining convergence within an ADPCM communication system during discontinuous transmission |
JP3090842B2 (ja) * | 1994-04-28 | 2000-09-25 | Oki Electric Industry Co., Ltd. | Transmitting apparatus adapted to Viterbi decoding |
US5742734A (en) * | 1994-08-10 | 1998-04-21 | Qualcomm Incorporated | Encoding rate selection in a variable rate vocoder |
FI105001B (fi) * | 1995-06-30 | 2000-05-15 | Nokia Mobile Phones Ltd | Method for determining the hangover time in a speech decoder in discontinuous transmission, and a speech decoder, and a transmitter-receiver |
US5774849A (en) * | 1996-01-22 | 1998-06-30 | Rockwell International Corporation | Method and apparatus for generating frame voicing decisions of an incoming speech signal |
US6269331B1 (en) | 1996-11-14 | 2001-07-31 | Nokia Mobile Phones Limited | Transmission of comfort noise parameters during discontinuous transmission |
US5960389A (en) * | 1996-11-15 | 1999-09-28 | Nokia Mobile Phones Limited | Methods for generating comfort noise during discontinuous transmission |
KR100389853B1 (ko) | 1998-03-06 | 2003-08-19 | Samsung Electronics Co., Ltd. | Method for recording and reproducing catalog information |
SE9803698L (sv) * | 1998-10-26 | 2000-04-27 | Ericsson Telefon Ab L M | Methods and devices in a telecommunications system |
EP1133886B1 (en) * | 1998-11-24 | 2008-03-12 | Telefonaktiebolaget LM Ericsson (publ) | Efficient in-band signaling for discontinuous transmission and configuration changes in adaptive multi-rate communications systems |
FI116643B (fi) | 1999-11-15 | 2006-01-13 | Nokia Corp | Noise suppression |
KR100312335B1 (ko) | 2000-01-14 | 2001-11-03 | 대표이사 서승모 | New SID frame determination method for the comfort noise generator of a speech coder |
US6687668B2 (en) * | 1999-12-31 | 2004-02-03 | C & S Technology Co., Ltd. | Method for improvement of G.723.1 processing time and speech quality and for reduction of bit rate in CELP vocoder and CELP vococer using the same |
US6631139B2 (en) * | 2001-01-31 | 2003-10-07 | Qualcomm Incorporated | Method and apparatus for interoperability between voice transmission systems during speech inactivity |
US7031916B2 (en) * | 2001-06-01 | 2006-04-18 | Texas Instruments Incorporated | Method for converging a G.729 Annex B compliant voice activity detection circuit |
US7099387B2 (en) * | 2002-03-22 | 2006-08-29 | Realnetorks, Inc. | Context-adaptive VLC video transform coefficients encoding/decoding methods and apparatuses |
US7613607B2 (en) * | 2003-12-18 | 2009-11-03 | Nokia Corporation | Audio enhancement in coded domain |
EP1897085B1 (en) * | 2005-06-18 | 2017-05-31 | Nokia Technologies Oy | System and method for adaptive transmission of comfort noise parameters during discontinuous speech transmission |
US7610197B2 (en) * | 2005-08-31 | 2009-10-27 | Motorola, Inc. | Method and apparatus for comfort noise generation in speech communication systems |
US8260609B2 (en) * | 2006-07-31 | 2012-09-04 | Qualcomm Incorporated | Systems, methods, and apparatus for wideband encoding and decoding of inactive frames |
US7573907B2 (en) * | 2006-08-22 | 2009-08-11 | Nokia Corporation | Discontinuous transmission of speech signals |
US8032359B2 (en) | 2007-02-14 | 2011-10-04 | Mindspeed Technologies, Inc. | Embedded silence and background noise compression |
PL2118889T3 (pl) * | 2007-03-05 | 2013-03-29 | Ericsson Telefon Ab L M | Method and controller for smoothing stationary background noise |
US8315756B2 (en) * | 2009-08-24 | 2012-11-20 | Toyota Motor Engineering and Manufacturing N.A. (TEMA) | Systems and methods of vehicular path prediction for cooperative driving applications through digital map and dynamic vehicle model fusion |
-
2008
- 2008-03-26 CN CN2008100840776A patent/CN101335000B/zh active Active
-
2009
- 2009-03-26 KR KR1020107016392A patent/KR101147878B1/ko active IP Right Grant
- 2009-03-26 WO PCT/CN2009/071030 patent/WO2009117967A1/zh active Application Filing
- 2009-03-26 BR BRPI0906521A patent/BRPI0906521A2/pt not_active IP Right Cessation
- 2009-03-26 RU RU2010130664/08A patent/RU2461898C2/ru active
- 2009-03-26 EP EP09726234.9A patent/EP2224428B1/en active Active
-
2010
- 2010-06-22 US US12/820,805 patent/US8370135B2/en active Active
- 2010-09-14 US US12/881,926 patent/US7912712B2/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0785541B1 (en) * | 1996-01-22 | 2003-04-16 | Rockwell International Corporation | Usage of voice activity detection for efficient coding of speech |
US6711537B1 (en) * | 1999-11-22 | 2004-03-23 | Zarlink Semiconductor Inc. | Comfort noise generation for open discontinuous transmission systems |
CN1513168A (zh) * | 2000-11-27 | 2004-07-14 | Nokia | Method and system for generating comfort noise in voice communication |
EP1288913B1 (en) * | 2001-08-31 | 2007-02-21 | Fujitsu Limited | Speech transcoding method and apparatus |
CN101335000A (zh) * | 2008-03-26 | 2008-12-31 | Huawei Technologies Co., Ltd. | Method and apparatus for encoding and decoding |
Non-Patent Citations (4)
Title |
---|
"An 8-32 kbit/s scalable wideband coder bitstream interoperable with G729", ITU-T RECOMMENDATION G.729.1 (EX G.729EV) : G729-BASED EMBEDDED VARIABLE BIT-RATE CODER, May 2006 (2006-05-01), pages 3 - 9 * |
"Technical Specification Group Services and System Aspects; Mandatory Speech Codec speech processing functions; AMR Speech Codec; Comfort noise aspects", 3GPP TS 26.092 V4.0.0, 3RD GENERATION PARTNERSHIP PROJECT, March 2001 (2001-03-01), pages 7 - 9 * |
ITU-T RECOMMENDATION G.729 ANNEX B: A SILENCE COMPRESSION SCHEME FOR G729 OPTIMIZED FOR TERMINALS CONFORMING TO RECOMMENDATION V70, November 1996 (1996-11-01), pages 9 - 15 * |
JIAO C. ET AL.: "A New Wideband Speech CODEC AMR-WB", COMPUTER SIMULATION, vol. 22, no. 1, January 2005 (2005-01-01), pages 150 - 152 * |
Also Published As
Publication number | Publication date |
---|---|
US8370135B2 (en) | 2013-02-05 |
EP2224428A4 (en) | 2011-01-12 |
EP2224428A1 (en) | 2010-09-01 |
KR101147878B1 (ko) | 2012-06-01 |
BRPI0906521A2 (pt) | 2019-09-24 |
US20100280823A1 (en) | 2010-11-04 |
CN101335000A (zh) | 2008-12-31 |
RU2010130664A (ru) | 2012-05-10 |
RU2461898C2 (ru) | 2012-09-20 |
US20100324917A1 (en) | 2010-12-23 |
CN101335000B (zh) | 2010-04-21 |
EP2224428B1 (en) | 2015-06-10 |
US7912712B2 (en) | 2011-03-22 |
KR20100105733A (ko) | 2010-09-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2009117967A1 (zh) | Method and apparatus for encoding and decoding | |
US8630864B2 (en) | Method for switching rate and bandwidth scalable audio decoding rate | |
US8532983B2 (en) | Adaptive frequency prediction for encoding or decoding an audio signal | |
KR101425944B1 (ko) | Enhanced coding/decoding of digital audio signals | |
JP4270866B2 (ja) | High-performance low-bit-rate coding method and apparatus for unvoiced speech | |
US9672840B2 (en) | Method for encoding voice signal, method for decoding voice signal, and apparatus using same | |
JP6752936B2 (ja) | Systems and methods for performing noise modulation and gain adjustment | |
MX2011000383A (es) | Low-bit-rate audio encoding/decoding scheme with common preprocessing. | |
EP1979895A1 (en) | Method and device for efficient frame erasure concealment in speech codecs | |
WO2010028301A1 (en) | Spectrum harmonic/noise sharpness control | |
MXPA04011751A (es) | Method and device for efficient frame erasure concealment in linear prediction based speech codecs. | |
WO2009067883A1 (fr) | Method and device for encoding/decoding background noise | |
EP2202726B1 (en) | Method and apparatus for judging dtx | |
CN108231083A (zh) | 一种基于silk的语音编码器编码效率提高方法 | |
Krishnan et al. | EVRC-Wideband: the new 3GPP2 wideband vocoder standard | |
CN101651752B (zh) | Method and apparatus for decoding | |
Patel et al. | Implementation and Performance Analysis of g. 723.1 speech codec |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 09726234 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 4288/DELNP/2010 Country of ref document: IN |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2009726234 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 20107016392 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2010130664 Country of ref document: RU |