|Publication number||US6333763 B1|
|Application number||US 09/229,028|
|Publication date||Dec 25, 2001|
|Filing date||Jan 12, 1999|
|Priority date||Jan 13, 1998|
|Publication number||09229028, 229028, US 6333763 B1, US 6333763B1, US-B1-6333763, US6333763 B1, US6333763B1|
|Original Assignee||Nec Corporation|
The present invention claims priority from Japanese Patent Application No. 10 004726 filed Jan. 13, 1998, the contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to a compression technique for compressing and coding audio data input together with motion picture data. In particular, the present invention can be utilized for compressing data in a personal computer.
2. Description of Related Art
In handling picture data and audio data in a personal computer, data compression/expansion techniques have been used in order to reduce the amount of data. An algorithm called MPEG compression is generally well known among conventional data compression/expansion techniques. MPEG compression is a technique for handling a large amount of data as a smaller amount of data: the amount of data can be reduced by increasing the compression rate if some degradation of picture quality is allowable, and the compression rate can be reduced when high picture quality is required. Currently, the MPEG2 compression technique, obtained by improving the basic MPEG compression technique, is in use. With the MPEG2 compression technique, picture data is compressed at a bit rate of 6 Mbps and audio data is compressed at a sampling rate of 44.1 kHz, as the main compression level. These numerical values are based on picture quality similar to that obtained in current television receivers and tone quality similar to that obtained from a compact disk.
In general, picture quality depends upon the rate of scene change and the bit rate. When the rate of scene change is low, the picture quality is not degraded substantially even if the bit rate is reduced, that is, the number of frames per unit time is reduced. However, when the rate of scene change is high, the picture quality is degraded considerably. In other words, when the rate of scene change is low, a large amount of data is not required, so no picture quality problem occurs even if the bit rate is reduced, while, when the rate of scene change is high, the picture quality is degraded unless the amount of data is increased, resulting in a picture which can hardly be watched comfortably. In view of this fact, an algorithm using variable bit rate processing has been developed, in which a picture whose rate of scene change is high is compressed at a high bit rate, while a picture whose rate of scene change is low is compressed at a lower bit rate.
As mentioned, the bit rate for a picture is changed correspondingly to the necessity of further reducing the amount of data and the processing thereof.
On the other hand, the amount of audio data is small compared with that of picture data, so it is usual to code the audio data at a constant sampling frequency. However, in general purpose equipment such as a personal computer, which performs almost all processing in software, the load on the central processing unit (CPU) is large, so it is desirable to compress even the comparatively small amount of audio data to some extent.
Japanese Patent Application Laid-open No. Hei 7-303240 discloses a technique in which, in processing audio data accompanying motion picture data, an audio signal is reproduced by changing the speed of the audio signal itself when a video signal is reproduced at a variable speed. In order to change the audio signal speed, the Time Domain Harmonic Scaling (TDHS) technique is used, with which it is possible to reproduce the audio signal at a variable speed without changing its pitch. However, this technique is used not to compress the amount of audio data but to reproduce recorded audio data while changing its speed.
Japanese Patent Publication No. Sho 59-3760 discloses a technique in which a sampling frequency for coding and a reproducing speed in decoding are selected according to a required service. In this technique, a clock rate is arbitrarily changed under control of a transfer control device according to the service, so that the coding bit rate during storage and the decoding bit rate during reproduction can be varied independently. However, this technique neither flexibly changes the sampling frequency within one service (a series of audio data) nor makes the compression rate of the audio data accompanying motion picture data variable.
Other well known techniques related to the compression of audio signals and picture signals and to the sampling processing used in compressing them are disclosed in Japanese Patent Application Nos. Sho 56-36700, Sho 64-10717, Hei 4-38767, Hei 7-154441, Hei 8-172645 and Hei 8-205092. However, these prior art techniques do not make the compression rate of the audio data accompanying the motion picture data variable.
An object of the present invention is to provide a coding method and apparatus capable of effectively compressing an audio data at a variable compression rate, in coding and compressing a motion picture data and the audio data.
That is, according to the present invention, the audio data coding method for coding the audio data input together with the motion picture data is featured by variably setting a sampling frequency of the audio data according to a scene represented by the motion picture data.
The coding apparatus according to the present invention realizes the above mentioned coding method and is featured by comprising sampling means for sampling an audio data input together with a motion picture data, coding means for coding data obtained by the sampling means and a sampling frequency control means for variably setting a sampling frequency of the sampling means correspondingly to a scene represented by the motion picture data.
The above mentioned and other objects, features and advantages of the present invention will become more apparent by reference to the following description of the invention taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a block circuit diagram of a coding device according to an embodiment of the present invention;
FIG. 2 shows the correspondence between the sampling frequency assignment of original audio data and the compressed data, for explaining a variable sampling rate coding method of the present invention;
FIG. 3A shows a relation between the original audio data and the amount of sampled data when the data is sampled at a constant sampling frequency of 44.1 kHz; and
FIG. 3B shows a relation between the original audio data and the amount of sampled data when the data is sampled at a variable sampling frequency.
FIG. 1 is a block diagram showing a construction of a coding device according to an embodiment of the present invention. The coding device shown in FIG. 1 comprises an A/D converter 11 and a sampling portion 12 which constitute an audio data coding unit provided in the coding device for coding a motion picture data and an audio data (referred to as “original audio data”, hereinafter) input together with the motion picture data, a compressing/coding portion 13 for coding data output from the sampling portion 12 and a sampling frequency control portion 14 for variably setting the sampling frequency of the sampling portion 12 correspondingly to a scene represented by the motion picture data. In this embodiment, it is assumed that the sampling portion 12 and the compressing/coding portion 13 are realized by a general purpose processor or a signal processor. Therefore, the original audio data which is an analog data is digitized by the A/D converter 11 and, then, a resultant digital data is sampled.
To describe the audio data coding method according to the present invention briefly, compression of digital data by means of MPEG, etc., in a digital data processing system such as a personal computer can be performed without waste by sampling the digital data adaptively at an optimal sampling frequency at which the tone quality required for a scene is obtainable. Further, since the data to be compressed is sampled at an optimal sampling frequency, high frequency sampling is performed for a scene in which high quality data is required and low frequency sampling is performed for a scene in which high quality is not required. Therefore, the amount of compressed coding data is reduced and the amount of processing is also reduced compared with a case where the data is sampled at a constant high sampling frequency.
FIG. 2 shows an example of a sampling frequency assignment of the original audio data and the compressed data. It should be noted that the compressed data is shown in an enlarged scale. In the same figure, AAU indicates an Audio Access Unit.
When a user compresses the original audio data, a sampling frequency for the original audio data is set by the sampling frequency control portion 14 for every scene of the motion picture. The sampling portion 12 samples the digitized original audio data by using the sampling frequency thus set. The sampled data is coded by the compressing/coding portion 13. Since the compressed data is usually produced by the compressing/coding portion 13 in a specific unit which is not always synchronized with the switching of scenes of the motion picture data corresponding to the original audio data, the switching of the original audio data does not always coincide with the switching of the compressed data.
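As a rough illustration of the sampling step described above, the following Python sketch converts already digitized audio to a chosen sampling frequency by linear interpolation. The function name `resample_linear` is hypothetical, and linear interpolation is only a stand-in: the specification does not state how the sampling portion 12 derives its samples, and a practical converter would apply a low-pass filter first to avoid aliasing.

```python
def resample_linear(samples, src_hz, dst_hz):
    """Resample a list of digitized samples from src_hz to dst_hz.

    Assumption: plain linear interpolation with no anti-aliasing filter.
    This is an illustrative stand-in, not the method of the specification.
    """
    if not samples:
        return []
    # Number of output samples covering the same time span (integer rates assumed).
    n_out = len(samples) * dst_hz // src_hz
    out = []
    for i in range(n_out):
        pos = i * src_hz / dst_hz          # position in the source sample grid
        k = int(pos)                       # source sample at or before pos
        frac = pos - k                     # fractional distance to the next sample
        nxt = samples[min(k + 1, len(samples) - 1)]
        out.append(samples[k] * (1 - frac) + nxt * frac)
    return out


# One second of audio digitized at 44.1 kHz, resampled for a low-quality scene.
one_second = [0.0] * 44_100
print(len(resample_linear(one_second, 44_100, 8_000)))
```

Under these assumptions, one second of input always yields as many output samples as the target frequency, which is the property the later data-amount comparison relies on.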
It is assumed here that audio data of a movie, etc., is compressed and coded and that the motion picture data corresponding to the original audio data is composed of a music scene, a human voice scene, a silent scene, a scene in which a car is running (car sound), etc. In such a case, since the silent scene and the scene in which a car is merely passing through do not require very high tone quality, a low sampling frequency is set for such scenes. On the other hand, a high sampling frequency is assigned to scenes such as music and human voice, which require high tone quality.
That is, a sampling frequency of 44.1 kHz compatible with a compact disk (CD) is assigned to the music scene, which requires high tone quality, a sampling frequency of 16 kHz or 32 kHz is assigned to the scene containing voices, which requires middle tone quality, and a low sampling frequency of 8 kHz is assigned to the silent scene, the car scene, etc., which do not require high tone quality. As mentioned above, since the compressed data unit is not always synchronized with the switching of scenes, the high sampling frequency is set for a unit that spans a scene boundary, by stretching the scene requiring the higher frequency to some extent so that it covers the unit.
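The assignment described above can be sketched as a lookup table plus a rule for units that straddle a scene boundary. The sketch below is one possible reading, not the patented implementation: the scene labels, the function name `rate_per_unit`, the choice of 32 kHz for voice, and the use of fixed-duration compression units are all assumptions made for illustration.

```python
import math

# Hypothetical scene labels mapped to the sampling frequencies named in the text.
SCENE_RATE_HZ = {
    "music": 44_100,
    "voice": 32_000,   # the text allows 16 kHz or 32 kHz; 32 kHz is picked here
    "silent": 8_000,
    "car": 8_000,
}


def rate_per_unit(scenes, unit_sec):
    """Return one sampling frequency per compression unit.

    scenes: contiguous, sorted list of (start_sec, end_sec, label).
    unit_sec: assumed fixed duration of a compression unit.
    A unit overlapping several scenes takes the highest of their rates,
    which corresponds to stretching the higher-quality scene over the unit.
    """
    total = scenes[-1][1]
    rates = []
    for i in range(math.ceil(total / unit_sec)):
        u_start = i * unit_sec
        u_end = min(u_start + unit_sec, total)
        overlapping = [
            SCENE_RATE_HZ[label]
            for start, end, label in scenes
            if start < u_end and end > u_start
        ]
        rates.append(max(overlapping))
    return rates


scenes = [(0, 3, "music"), (3, 8, "silent"), (8, 10, "voice")]
print(rate_per_unit(scenes, 2))
```

With these hypothetical scenes and 2-second units, the second unit spans the music/silent boundary at 3 seconds and therefore keeps the 44.1 kHz rate.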
In order to allow the compressed data to be expanded (reproduced), information on the sampling frequency is described in a header which the compressing/coding portion 13 adds to the compressed data in units of AAU. On the basis of the information described in the header portion, the receiving side of the compressed data can expand and reproduce each portion of the compressed data at the sampling frequency corresponding to it.
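To make the role of the header concrete, the following sketch writes a sampling frequency in front of each unit and reads it back on the receiving side. The 6-byte layout (a 32-bit rate in Hz followed by a 16-bit payload length) and the function names are hypothetical; this is not the MPEG audio frame header, which encodes the sampling frequency differently.

```python
import struct

# Hypothetical header layout: sampling rate in Hz (uint32), payload length in bytes (uint16).
HEADER = struct.Struct(">IH")


def pack_unit(rate_hz, payload):
    """Prefix one unit of coded data with a header carrying its sampling rate."""
    return HEADER.pack(rate_hz, len(payload)) + payload


def unpack_units(stream):
    """Split a byte stream into (rate_hz, payload) pairs using the headers."""
    units = []
    pos = 0
    while pos < len(stream):
        rate_hz, length = HEADER.unpack_from(stream, pos)
        pos += HEADER.size
        units.append((rate_hz, stream[pos:pos + length]))
        pos += length
    return units


stream = pack_unit(44_100, b"music-bytes") + pack_unit(8_000, b"quiet")
for rate_hz, payload in unpack_units(stream):
    # A decoder would configure its output rate from rate_hz before expanding payload.
    print(rate_hz, len(payload))
```

Because every unit carries its own rate, the decoder needs no knowledge of the scene boundaries used at coding time.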
FIGS. 3A and 3B show a relation between the original audio data and the amount of data after the sampling, in which FIG. 3A shows a case where the data is sampled at a constant sampling frequency of 44.1 kHz and FIG. 3B shows a case where the data is sampled at a variable sampling frequency. Referring to FIG. 3A, since the sampling frequency is constantly 44.1 kHz in the conventional method, the amount of data for each of the respective data portions is the same as that of the AAU. In contrast, in the case shown in FIG. 3B, a variable sampling frequency with a maximum of 44.1 kHz and a minimum of 8 kHz is assigned to each of the respective scenes. Therefore, the amount of data of a scene to which a low sampling frequency is assigned is small.
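The difference between the two cases can be checked with simple arithmetic: the number of samples in a scene is its duration multiplied by its sampling frequency. The scene durations below are hypothetical and chosen only for illustration; the frequencies are those named in the text.

```python
def sample_count(durations_sec, rates_hz):
    """Total number of samples when each scene is sampled at its own rate."""
    return sum(d * r for d, r in zip(durations_sec, rates_hz))


# Hypothetical two-minute clip: 60 s of music, 30 s of silence, 30 s of voice.
durations = [60, 30, 30]

constant = sample_count(durations, [44_100, 44_100, 44_100])   # case of FIG. 3A
variable = sample_count(durations, [44_100, 8_000, 32_000])    # case of FIG. 3B

print(constant, variable, round(variable / constant, 3))
```

For these assumed durations the constant case yields 5,292,000 samples and the variable case 3,846,000, roughly 73% of the constant case, before any coding is applied.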
As mentioned, by sampling the original audio data at sampling frequencies variably set to be optimal for the respective scenes, it is possible to reduce the amount of data to be compressed and coded by the compressing/coding portion 13 and thereby to reduce the amount of processing thereof. On the other hand, the quality of the compressed data is low for a scene for which a low sampling frequency is set. However, in the silent scene or the running car scene, some degradation of tone quality is negligible, and the lower sampling frequency is advantageous for data processing. If the audio data were sampled at a high sampling frequency in the silent scene, the data processing therefor would be wasted.
As described, according to the present invention in which the sampling frequency of the audio data is changed correspondingly to the scene of motion picture such that a high quality compressed data is produced for a scene which requires a high quality and a low quality compressed data is produced for scenes including a silent scene which do not require a high quality, it is possible to produce a compressed data of optimal quality to scenes without waste of sampling processing to thereby reduce the amount of compressing/coding data and the processing amount thereof, compared with the conventional case in which the sampling is performed at a constant sampling frequency.
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US5231492 *||Mar 16, 1990||Jul 27, 1993||Fujitsu Limited||Video and audio multiplex transmission system|
|US5461619 *||Jul 6, 1993||Oct 24, 1995||Zenith Electronics Corp.||System for multiplexed transmission of compressed video and auxiliary data|
|US5500672 *||Mar 2, 1994||Mar 19, 1996||Matsushita Electric Industrial Co., Ltd.||Multi-media communication apparatus for transmitting audio information, video information and character information simultaneously|
|US5512939 *||Apr 6, 1994||Apr 30, 1996||At&T Corp.||Low bit rate audio-visual communication system having integrated perceptual speech and video coding|
|US5548346 *||Nov 4, 1994||Aug 20, 1996||Hitachi, Ltd.||Apparatus for integrally controlling audio and video signals in real time and multi-site communication control method|
|US5553220 *||Sep 7, 1993||Sep 3, 1996||Cirrus Logic, Inc.||Managing audio data using a graphics display controller|
|US5617145 *||Dec 22, 1994||Apr 1, 1997||Matsushita Electric Industrial Co., Ltd.||Adaptive bit allocation for video and audio coding|
|US6067126 *||Jan 5, 1998||May 23, 2000||Intel Corporation||Method and apparatus for editing a video recording with audio selections|
|JPH0438767A||Title not available|
|JPH0738437A||Title not available|
|JPH07154441A||Title not available|
|JPH07303240A||Title not available|
|JPH08172645A||Title not available|
|JPH08205092A||Title not available|
|JPS593760A||Title not available|
|JPS5636700A||Title not available|
|JPS6410717A||Title not available|
|1||Japanese Office Action issued Oct. 25, 2000 in a related application with English translation of relevant portions.|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US6741649 *||Feb 3, 2000||May 25, 2004||Nec Corporation||Coding apparatus for audio and picture signals|
|US7336747||Jan 20, 2004||Feb 26, 2008||Digital Compression Technology||Coding system for minimizing digital data bandwidth|
|US7835627 *||Apr 3, 2006||Nov 16, 2010||Stmicroelectronics S.A.||Method and device for restoring sound and pictures|
|US7961258 *||Jun 14, 2011||Lg Electronics Inc.||Image display apparatus having sound level control function and control method thereof|
|US8682664 *||Sep 27, 2011||Mar 25, 2014||Huawei Technologies Co., Ltd.||Method and device for audio signal classification using tonal characteristic parameters and spectral tilt characteristic parameters|
|US20020177915 *||May 20, 2002||Nov 28, 2002||Akinobu Kawamura||Audio amplifier circuit with digital audio interface and codec device using the same|
|US20040208271 *||Jan 20, 2004||Oct 21, 2004||Gruenberg Elliot L.||Coding system for minimizing digital data bandwidth|
|US20050036069 *||Aug 11, 2004||Feb 17, 2005||Lee Su Jin||Image display apparatus having sound level control function and control method thereof|
|US20060245732 *||Apr 3, 2006||Nov 2, 2006||Stmicroelectronics S.A.||Method and device for restoring sound and pictures|
|US20120016677 *||Jan 19, 2012||Huawei Technologies Co., Ltd.||Method and device for audio signal classification|
|WO2004066501A2 *||Jan 20, 2004||Aug 5, 2004||Digital Compression Technology, Lp||Coding system for minimizing digital data bandwidth|
|WO2004066501A3 *||Jan 20, 2004||Dec 21, 2006||Patrick Antaki||Coding system for minimizing digital data bandwidth|
|WO2015149115A1 *||Apr 1, 2015||Oct 8, 2015||Barratt Lachlan Paul||Modified digital filtering with sample zoning|
|U.S. Classification||348/484, 348/738, 704/E19.039|
|International Classification||G10L19/14, G10L19/00, H03M7/30, G11B20/10|
|Jan 12, 1999||AS||Assignment|
Owner name: NEC CORPORATION, JAPAN
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TANAKA, NOBUYUKI;REEL/FRAME:009714/0442
Effective date: 19981224
|Jun 1, 2005||FPAY||Fee payment|
Year of fee payment: 4
|May 27, 2009||FPAY||Fee payment|
Year of fee payment: 8
|Aug 20, 2009||AS||Assignment|
Owner name: CRESCENT MOON, LLC, CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NEC CORPORATION;REEL/FRAME:023119/0734
Effective date: 20090616
|May 2, 2012||AS||Assignment|
Owner name: RPX CORPORATION, CALIFORNIA
Effective date: 20120420
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:OAR ISLAND LLC;REEL/FRAME:028146/0023
|Aug 1, 2013||AS||Assignment|
Owner name: HTC CORPORATION, TAIWAN
Effective date: 20130718
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:RPX CORPORATION;REEL/FRAME:030935/0943
|Aug 2, 2013||REMI||Maintenance fee reminder mailed|
|Dec 25, 2013||LAPS||Lapse for failure to pay maintenance fees|
|Feb 11, 2014||FP||Expired due to failure to pay maintenance fee|
Effective date: 20131225