|Publication number||US2151091 A|
|Publication date||Mar 21, 1939|
|Filing date||Oct 30, 1935|
|Priority date||Oct 30, 1935|
|Also published as||US2098956|
|Inventors||Homer W Dudley|
|Original Assignee||Bell Telephone Labor Inc|
[Drawings: 3 sheets. H. W. Dudley, Signal Transmission, filed Oct. 30, 1935; No. 2,151,091, March 21, 1939.]
Patented Mar. 21, 1939
UNITED STATES PATENT OFFICE
SIGNAL TRANSMISSION
Homer W. Dudley, Garden City, N. Y., assignor to Bell Telephone Laboratories, Incorporated, New York, N. Y., a corporation of New York
Application October 30, 1935, Serial No. 47,393
This invention relates to wave transmission, as for example, signaling.
An object of the invention is to reduce the frequency range required for the transmission of signals, as for instance, speech.
Further, it is an object of the invention to effect such reduction without necessitating increase in the time of transmission.
It is also an object of the invention to facilitate transmission of a message over a transmission medium not adapted to readily pass the frequency band originally occupied by the message, as for example, to facilitate telephoning over long submarine cables where the higher speech frequencies cannot readily be transmitted because of excessive attenuation.
Another object is to facilitate increasing the number of signaling channels that a wave transmission medium, as for example, a land line or a radio circuit, can transmit in a given frequency range.
Another object is to facilitate increasing the signal-to-noise ratio obtainable at the receiving end of a system without increasing the sending level, as for instance, in long distance radio communication where the transmitting power required tends to become excessive.
It is also an object of the invention to promote secrecy in transmission of signals, for instance, in wire or radio transmission of speech.
Objects of the invention are also analysis, synthesis, production, reproduction, reconstruction, simulation, imitation, modification and control of sounds and waves representing sounds, especially speech sounds and vocal effects of usual and unusual character.
The information transmitted by speech does not absolutely require all the frequency space allotted to it in the human voice. A specific case can be worked out as to how much frequency band is required as a minimum, for example, by determining and taking account of the number of the independent variables or parameters involved in speech production (i. e., the number of the independently movable physical elements of the vocal system that are involved in speech production) and the rate at which these can vary. This is done hereinafter, from which it is estimated that speech from 250 to 7100 cycles can be represented almost perfectly by the information that can be passed in a 500 cycle band or so using the same time of transmission as the speech itself requires.
The well-known rule of conservation of the product of transmitted frequency band width and time of transmission, formulated by Hartley (Transmission of Information, Bell System Technical Journal, July, 1928), which says that to produce a replica of a given piece of transmission information, the frequency band in the transmitting medium can be decreased by any desired factor only when a time of transmission is used that is longer by the same factor, applies to the case when no information as to the signal source at the transmitting end of the system is available at the receiving end or when such information is available but cannot be used in a practical way at the receiving end to reproduce information which is not transmitted over the line. Also this rule is stated for the case of distortionless transmission. Where there is distortion in the transmission medium some types are less objectionable to the ear than other types, phase distortion, for instance, within wide limits being unimportant in telephony. Thus a given frequency band with a certain amount of distortion of one type may transmit intelligence better than a wider frequency band having distortion of a more objectionable type.
With this understanding as to the rule, it will be clear from the disclosure herein that nothing is proposed herein which is at all contrary to the rule.
Analyzing the vocal system of a man from the broad viewpoint of producing speech sounds it is seen to be made up of two types of parts, (1) fixed and (2) variable. The variable parts as considered here are those that vary in position from sound to sound. Examples are the lips and teeth opening and closing, the tongue shifting forward and backward, the vocal cords varying in tension and the uvula opening and closing the nasal passage. The term fixed is here used in its broadest sense. It not only includes parts that are not moved from sound to sound in speech, such as the nasal passages, the pharynx and much of the larynx, but it also includes any fixity of feature. As an example, the fact that the vocal cords are always used in the voiced sounds is a fixed feature, as is also the fact that they always vibrate in the same buzzer-like way as regards the presence of a fundamental frequency and all of its overtones up to a large number greater than 30; the variation of the fundamental frequency, or pitch, of the vocal cords is, on the other hand, a variable feature as noted above.
The whole vocal system may be likened to a mechanical-acoustical oscillator with certain fixed circuits and certain variable mechanical elements. This circuit is the same from man to man. It is the same from sound to sound in the same man with the different circuit elements taking on different values to produce the different sounds. To make the analogy just referred to more specific, consider that the vocal system is, in principle, like the ordinary electrical oscillator mounted in a box as a fixed piece of apparatus, the variability being obtained by switches for starting the oscillator and for choosing the desired inductances, by continuously variable dials for selecting the capacitance, and by step variable dials for adjusting the resistances controlling the output. With such an arrangement other features such as feed-back may also be controlled. This vocal circuit, fixed in that no different circuit may be substituted therefor, is probably the most important fixed part or element of the vocal system.
Just as the oscillator when oscillating is still essentially a fixed piece of apparatus to which variable controls of frequency, output and feedback are applied, just so the fixed vocal system includes the condition of the vocal cords vibrating at an average or other specified steady rate to which condition the controls for varying the generated signal can be applied. The importance of including a normal vibration of the vocal cords as a fixed feature is due to the fact that they vibrate on the average at 100 to 150 cycles per second whereas the variable controls specified can change only at rates ten or more times smaller than this. Strictly the fixed features include a multi-oscillator source of energy rather than a single one, for not only are there the periodic oscillations produced at the vocal cords, but there are also non-periodic or random oscillations produced by the passage of air through constrictions or restricted openings such as between lip and teeth for the f sound, between tongue and hard palate for the sh sound, between the vocal cords themselves for whispering, etc.
From another point of view the variable parts are those controlled volitionally and the fixed parts those controlled non-volitionally if changed at all. Thus one chooses the lip position he desires for a particular sound but in the case of the vocal cords he blows a puff of air through them when they are held with a certain tenseness and they then vibrate at their natural frequency.
This differentiation between fixed and variable features may be applied to characterize the type of signal produced just as it has been applied to the type of element producing the signal. In this case the fixed features correspond to the sustained oscillatory sound produced with the various elements or parts of the vocal system in an average or normal position. This means an average lip opening, an average vocal cord tension, etc. The variable features correspond to the changing or modulating of the sound by varying the different elements from their average positions. A particular set of variations leads to silence as an extreme case of modulating the normal speech sound. In Harvey Fletcher's book "Speech and Hearing", 1929, there are shown a number of [...] up until it totals enough so that the original wave form is lost, this is found to occur many periods later, oftentimes twenty or more requiring .200 second or so. This latter type of change is the modulatory type of change. It is easiest seen in a single sound as a building-up of the peak amplitudes to a maximum and then a falling off to zero again. In this case a complete change corresponds to a large part or all of a syllable and such change is therefore known as a syllabic frequency change.
From the foregoing it is evident that speech has a dual characteristic. On the one hand we have fixed parts or elements setting up oscillatory waves containing high frequency patterns. On the other hand we have varying parts or elements setting up modulatory waves of low syllabic frequency pattern.
An ideal frequency range compressing system would be one that took advantage of the dual nature of speech by setting up at the receiving end all the fixed features and transmitting over the line from the sending end only information as to the instantaneous positions of the variable parts. The frequency range compressing system need not be so identical a copy of the vocal system however. It is well known that one set of parameters can be substituted for another without any loss of definition so long as the number of independent parameters remains unchanged. Any change from the simple ideal mentioned does generally lead to a larger number of required parameters because the newly selected ones are not independent. However, this is not of much practical importance, for the great saving of frequency range accomplished by the invention is obtained by transmitting syllabic frequencies corresponding to the variable or modulatory controls of the speech signal and not transmitting the high frequency pattern corresponding to the fixed oscillatory characteristics of the speech sounds. As this saving is of the order of 10 to 1, a 50 to 100 per cent increase in the parameters required to satisfactorily represent speech is unimportant.
As pointed out in detail hereinafter, the number of the independent variables involved in the production of speech is small. That is, the number of the movable or variable elements of the vocal system that are controlled as parameters to give the desired speech production and are movable or variable substantially independently of one another by the muscles of the vocal system is small, or in other words the number of variables or parameters that can be controlled substantially independently in speech production is small, being of the order of ten. Moreover, as indicated above and discussed hereinafter, for each of these physical or physiological elements, the minimum time in which it can go through a complete cycle of change in position is not less than about a tenth of a second, or in other words, each independent variable has a fundamental frequency of not over [...] variable characteristics of the speech. As explained before, these defining signals can be any signals derived from speech signals provided the derived signals give as many independently variable quantities or factors or parameters as the number of independent variables involved in the production of speech; and the chosen parameters need not be entirely independent provided their number be increased sufficiently to make up for their lack of independence. For example, the chosen parameters may be the average amounts of power in subbands of the frequency band of speech waves, as brought out in detail hereinafter.
There exists, then, in the actual production of a complex speech wave by the vocal system, a simple set of slowly varying elements or parameters (the independently variable elements, referred to above, of the vocal system), that determine the variable characteristics of the signal (which are referred to above); and to transmit information that will suffice for defining or reproducing the variable characteristics, it is sufficient to transmit information defining the variations of any simple set of parameters derived from the complex speech wave and corresponding to the independently variable elements of the vocal system as regards number and independence, or as regards number of degrees of freedom of variation.
Like considerations hold in the case of other complex signals, as for example, signals which, as in the case of speech, have a sluggish rate of change modulating or controlling a rapid rate of change of signal strength, or from the point of view of frequency, a set of low frequencies modulating a set of high frequencies. Music may be mentioned specifically as one such example. Each note has a fundamental and upper harmonics of a resonant system formed for example from a stretched string, vibrating reed or air tube or chamber, and these fundamental and harmonic components of the notes form the set of high frequencies in the complex signal wave, the low frequency set corresponding to the rates of energizing such resonant systems, as for example by the hand striking the piano keys. Each key is an independently variable element in this case if one considers the keyboard as the starting point. If the hands are considered as the starting point, then the maximum number of fingers used at any one instant are the independently variable parameters. The rest of the piano mechanism is of a fixed nature.
In accordance with one feature of the invention, to transmit a complex signal, for example speech, variant information in the signal or information regarding the variable or unpredictable characteristics of the signal is transmitted, instead of the complex signal wave. The waves transmitted can define the signal precisely, as regards its unknown or variable characteristics, yet have small frequency range relative to the signal wave and as short duration as the signal. The signal defining waves transmitted need not directly contain any intelligibility whatsoever. Thus a high degree of secrecy is attainable.
In accordance with a feature of the invention as applied to transmission of speech, for example, advantage is taken of the fact that much of the information ordinarily transmitted is of an invariable or predictable character, due to the general uniformity of the speech producing organs from person to person, by reproducing such predictable information artificially at the receiving end of the transmission system in order that it need not be transmitted. Thus, effective use is made of information or foreknowledge of the fixed or invariable characteristics of the signal source, with the result that the frequency band width of transmission can be reduced.
In one specific aspect, the invention is a system in which a speech signal is analyzed for its fundamental frequency and the average power in properly chosen subbands of frequency, this information being transmitted either directly or by carrier currents, for example, and then used at the receiving end to fashion waves from a local multi-frequency source into a simulation of the signal.
To fashion the simulation of the signal from the waves supplied by the local source, frequency subbands of those waves are selected which are respectively coextensive with the chosen subbands of the speech signal, and the average power in each subband of the locally supplied waves is varied in accordance with the power in the corresponding chosen subband of the signal wave.
This variation is effected in response to the information transmitted from the sending end of the system regarding the average power in chosen subbands of the signal wave.
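The subband operation just described can be illustrated by a minimal sketch. It is an assumption-laden stand-in, not the circuit of this specification: a naive discrete Fourier transform of one short frame takes the place of band filters, and the names `half_spectrum`, `band_powers` and `shape_source`, the frame length and the band edges are all hypothetical choices made for illustration.

```python
import cmath
import math

def half_spectrum(frame):
    """Naive DFT of a real frame, returning bins 0..N/2 (illustrative only)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def band_powers(bins, edges):
    """Average power |X|^2 over the bins of each subband [edges[i], edges[i+1])."""
    return [sum(abs(b) ** 2 for b in bins[lo:hi]) / (hi - lo)
            for lo, hi in zip(edges, edges[1:])]

def shape_source(source_bins, target_powers, edges):
    """Scale each subband of the local source so its average power matches the target."""
    shaped = list(source_bins)  # bins outside the listed subbands are left unchanged
    source_powers = band_powers(source_bins, edges)
    for i, (lo, hi) in enumerate(zip(edges, edges[1:])):
        p_src = source_powers[i]
        gain = math.sqrt(target_powers[i] / p_src) if p_src > 0 else 0.0
        for k in range(lo, hi):
            shaped[k] = source_bins[k] * gain
    return shaped

# Hypothetical data: a two-component "speech" frame and a harmonic local source.
n = 64
speech = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 20 * t / n)
          for t in range(n)]
buzz = [sum(math.sin(2 * math.pi * h * t / n) for h in range(2, 32, 2))
        for t in range(n)]
edges = [1, 9, 17, 25, 33]  # four assumed subbands, expressed in DFT bins

targets = band_powers(half_spectrum(speech), edges)          # sending end: analysis
shaped = shape_source(half_spectrum(buzz), targets, edges)   # receiving end: shaping
```

Only the four numbers in `targets` would need to pass over the line for this frame; the fine structure of the spectrum comes from the local source.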
As indicated above, two types of power spectra or frequency patterns of energy are used alternately in speech, (1) a continuous spectrum in the case of hissing or unvoiced sounds and (2) in the case of voiced sounds a discrete spectrum with a variable fundamental and with all of the upper harmonics always present. Hence, the local source provided is preferably such that the waves supplied by it can have either type of power spectrum. The type and the fundamental frequency can be varied in response to the information transmitted from the sending end of the system regarding the presence or absence of a fundamental frequency in the speech wave and the magnitude of any such fundamental frequency.
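A local source of this two-spectrum kind might be sketched as follows. The sketch is illustrative only: the function name `local_source`, the sampling rate, the use of seeded uniform random numbers as the continuous-spectrum source, and the convention that a fundamental of zero selects the continuous spectrum are assumptions, not details taken from this specification.

```python
import math
import random

def local_source(f0_hz, num_samples, sample_rate_hz, seed=0):
    """Return waves with a continuous spectrum (f0_hz <= 0) or a discrete one.

    Discrete case: the fundamental plus every harmonic up to half the
    sampling rate, all at equal amplitude (an assumed, simplified buzz).
    Continuous case: seeded uniform random samples standing in for noise.
    """
    if f0_hz <= 0:
        rng = random.Random(seed)
        return [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    num_harmonics = max(1, int((sample_rate_hz / 2) // f0_hz))
    out = []
    for n in range(num_samples):
        t = n / sample_rate_hz
        total = sum(math.sin(2 * math.pi * h * f0_hz * t)
                    for h in range(1, num_harmonics + 1))
        out.append(total / num_harmonics)
    return out

voiced = local_source(100.0, 200, 8000)   # discrete spectrum, 100 cycle fundamental
unvoiced = local_source(0.0, 200, 8000)   # continuous spectrum
```

With an 8000 sample per second rate and a 100 cycle fundamental, the voiced output repeats every 80 samples, which is the periodicity the discrete spectrum implies.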
Since significant changes in the fundamental frequency of the speech sounds and in frequency distribution of power in speech can take place only at a rate which is limited by the sluggishness of the muscles of the vocal system to less than about 10 cycles per second (a frequency much lower than the fundamental frequencies, such as 60 to 500 cycles, of vocal cords), it results that the equipment required at the sending end of the system for analyzing the speech signal for fundamental, and likewise the equipment provided at the receiving end of the system for responding to the transmitted indications as to the fundamental frequency of the speech sounds, need only be responsive (on the line side) to frequencies up to perhaps one to three times the frequency of 10 cycles per second, just mentioned, depending on the accuracy desired in the transmission of the indications.
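The limited response required of the indication channel can be illustrated with a first-order low-pass smoother applied to a sequence of fundamental frequency readings. This is a sketch under stated assumptions: the function name `smooth_control`, the 25 cycle cutoff (chosen within the one-to-three-times-10-cycle range mentioned above) and the 1000 reading per second rate are hypothetical, and a one-pole recursion is only one of many ways such a limit could be realized.

```python
import math

def smooth_control(samples, sample_rate_hz, cutoff_hz=25.0):
    """One-pole low-pass: passes slow (syllabic) changes, attenuates faster ones."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out = []
    state = samples[0] if samples else 0.0  # start at the first reading
    for x in samples:
        state += alpha * (x - state)        # move a fraction of the way toward x
        out.append(state)
    return out

# Hypothetical pitch indication stepping from 100 to 200 cycles.
step = [100.0] * 50 + [200.0] * 100
smoothed = smooth_control(step, sample_rate_hz=1000.0)
```

The smoothed indication follows the step within a few hundredths of a second, which is fast compared with the tenth of a second or so that the vocal muscles need for a complete change.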
Moreover, the number of subbands analyzed for power content need not exceed five or ten for [...] fixed features and variable features. The fixed features include (a) the existence of definite frequency subbands in which the power distribution is sensibly uniform, (b) the existence of a frequency spectrum that alternates from the continuous type to a discrete type with varying fundamental and with all upper harmonics always present, and (c) the fact that time variations of the fundamental frequency and of the power in the frequency subbands occur only at syllabic frequency rates. The variable features include (A) the magnitude of the average power in each subband, and (B) the nature of the signal spectrum (as to whether it is continuous or discrete and in the latter case, as to what is the magnitude of the fundamental frequency). Since there is foreknowledge at the receiving end as to the fixed features or characteristics of the signal, they can be supplied locally at the receiving end and it is unnecessary to transmit information regarding them. Their supply locally is accomplished by the choice of the type of circuit, the choice of elements to simulate the vocal cords and the eddying constrictions of the vocal system, and the choice of frequency subbands. It is now sufficient to transmit information defining the variable characteristics and combine them with the locally supplied fixed circuit features to reproduce the signal.
As indicated above and hereinafter, the invention is by no means limited to this specific form.
In accordance with a feature of the invention the character or expression or vocal effect of synthesized or reproduced vocal sounds can be controlled or modified by suitable control of their fundamental frequency.
For example in the frequency range reducing and restoring system described above, when normal reproduction of speech is desired the variation of the fundamental frequency in the reproduced speech is made the same as in the original speech; or in other words, when normal reproduction of speech is desired the value of the ratio between the fundamental frequency set up at the reproducing end of the system and the fundamental frequency of the input at the sending or analyzing end is maintained at unity. Further, the value of the ratio may be made to undergo changes or deviations from unity, or the fundamental frequency of the speech or vocal sounds may be made to undergo modifications, suitable for producing prescribed modifications in the speech or vocal sound effects. The modifications may, for example, be such as to give rise to odd voice effects which simulate effects commonly observed in human voices. For instance, a person can talk in an ordinary voice and yet give rise to speech that has the character of a whisper or the character of a chant, for example, dependent upon the condition or adjustment of the system. With the system in proper condition or adjustment he can get the prescribed effect with the uniformity of machine production. Thus a single person, though talking in a normal voice, can produce different [...] in different conditions or adjustments. A whisper can be obtained by manufacturing the speech or voice from energy having only a continuous spectrum, for example, resistance noise. (Energy having a continuous frequency spectrum or frequency pattern can be viewed as energy with its fundamental frequency infinitesimal or substantially zero, or in other words, can be viewed practically as substantially energy with no fundamental frequency present.) Thus, the whisper can be obtained by conditioning or adjusting the system so that the local source at the receiving end of the system, which supplies the waves from which the speech is reconstructed, will supply waves of only one type of spectrum, the continuous spectrum (instead of supplying waves of both the continuous type of spectrum and the discrete type of spectrum as when natural reproduction is desired). Then sounds voiced in the original speech are reproduced as unvoiced sounds. A chant can be obtained by manufacturing the voice from energy having only a discrete type of spectrum and a fixed fundamental frequency, as for example, by conditioning or adjusting the system so that the local source at the receiving end of the system, which supplies the waves from which the voice is reconstructed or manufactured, will supply waves of only the discrete type of spectrum and with their fundamental frequency set at an arbitrarily chosen frequency such as the average vocal cord frequency instead of variable with the fundamental frequency in the original speech as when natural reproduction is desired.
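The three conditions of the system described above (natural reproduction, whisper, chant) can be summarized in a short sketch operating on a sequence of fundamental frequency indications. The function name `control_pitch`, the mode names, the use of zero to denote a continuous spectrum, and the 125 cycle chant frequency (taken as a mid value of the 100 to 150 cycle average vocal cord rate) are illustrative assumptions.

```python
def control_pitch(f0_track_hz, mode="natural", ratio=1.0, chant_hz=125.0):
    """Map received fundamental-frequency indications to the local source setting.

    A value of 0.0 is taken to mean 'no fundamental', i.e. continuous spectrum.
    """
    if mode == "natural":
        # Ratio of unity gives normal reproduction; other ratios shift the pitch.
        return [f0 * ratio for f0 in f0_track_hz]
    if mode == "whisper":
        # Continuous spectrum only: voiced sounds are reproduced as unvoiced.
        return [0.0 for _ in f0_track_hz]
    if mode == "chant":
        # Discrete spectrum only, with the fundamental held at a fixed value.
        return [chant_hz for _ in f0_track_hz]
    raise ValueError("unknown mode: " + repr(mode))

track = [0.0, 110.0, 120.0, 0.0, 140.0]   # hypothetical indications, 0.0 = unvoiced
natural = control_pitch(track)
whisper = control_pitch(track, mode="whisper")
chant = control_pitch(track, mode="chant")
```

The power indications for the subbands are left untouched in all three cases; only the spectrum type and fundamental of the local source change.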
It is seen, then, that in accordance with the invention, speech can be manufactured or artificially created from different types of energy spectra, as for example, from energy having a continuous frequency spectrum or from energy having a discrete frequency spectrum, or from both types of energy, by relatively varying the time flow of energy in frequency subbands of the spectrum substantially in accordance with the relative variations of the time flow of energy [...] fundamental frequency is maintained at a fixed value such as the average vocal cord frequency, or having the sound of a whisper when the fundamental frequency approaches zero, i. e., when the energy spectrum is continuous.
A specific aspect of the invention is artificial production, manufacture or synthesis of speech of prescribed character (for example, speech of usual character, or speech of unusual character such for instance as a whisper or a chant), with the aid of information as to the frequency pattern and the amplitude pattern, or in other words, information as to the three characteristics, pitch, resonant frequency regions and loudness, of the speech sounds desired.
Also, a specific aspect of the invention is determining or obtaining these three characteristics for any particular set of speech signals by instantaneous analysis of the speech signals.
Also, a specific aspect of the invention is synthesis or artificial reconstruction or reproduction [...]structed speech by artificially modifying the fundamental period of speech relatively to the speech duration.
Other objects and aspects of the invention will be apparent from the following description and claims.
Fig. 1 shows schematically a system embodying the invention in its specific form referred to above;
Fig. 2 is a detailed showing of the transmitting circuit at the west terminal of the system, this transmitting circuit being the west speech frequency compressing circuit shown as the block in the upper left-hand corner of Fig. 1; and
Fig. 3 is a detailed showing of the receiving circuit at the east terminal of the system, this receiving circuit being the east speech frequency restoring circuit shown as the block in the upper right-hand corner of Fig. 1.
There is disclosed hereinafter a particular circuit which can be used for reducing the frequency range of speech by setting up artificial currents of limited frequency range which will transmit information about the independent variables of the speech producing organs in man. The first important question is: What frequency range does speech inherently require for transmitting its information content? In the operation of such a system with high quality transmission of speech, a person at one end of a telephone line talks into a high quality subset. At the distant end the listener also has a high quality subset. The transmission between the two is not direct but is carried on in the following way. At the sending end is special apparatus to tell the characteristics of the speech signal being fashioned by the talker. At the receiving end is other apparatus to receive these signals (transmitted directly or transmitted indirectly as by carrier currents) and produce speech that is a very close copy so far as the ear can determine of that at the sending end. In between is a transmission line of limited frequency range. The transmission must take place as rapidly as the speech is produced. It is desirable to determine roughly what is the minimum frequency band that can be used on the transmission line for such high quality transmission.
A rough answer to this problem can be obtained by various methods. One method is used hereinafter to obtain an answer and another to check that the answer is fairly conservative. The method used for obtaining an answer is essentially as follows. We determine how many independent variables are involved in the production of speech. Next we determine the minimum amount of time in which one of these variable elements can go through a complete cyclic change of position. From these the frequency band required for the transmission of reasonably precise information as to the instantaneous position of each independent physical variable is obtained. Next multiplying this frequency band by the number of independent variables we have the total minimum frequency band required.
For actual transmission in circuits allowances must be made for the inefficiency of filter and other auxiliary circuits that will be required.
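The method of estimate outlined above reduces to a short piece of arithmetic, sketched here. The figures used (ten independent variables, a tenth of a second for a complete cycle of change, and an accuracy factor of one to three) are those mentioned elsewhere in this specification; the function name `minimum_band_hz` and the way the factor is applied are assumptions made for illustration, and no allowance for filter inefficiency is included.

```python
def minimum_band_hz(num_variables, min_cycle_time_s, accuracy_factor=1.0):
    """Rough minimum line band: variables x (band needed per variable)."""
    fundamental_hz = 1.0 / min_cycle_time_s            # fastest complete cycle of one variable
    per_variable_hz = accuracy_factor * fundamental_hz  # band for reasonably precise tracking
    return num_variables * per_variable_hz

low_estimate = minimum_band_hz(10, 0.1, accuracy_factor=1.0)
high_estimate = minimum_band_hz(10, 0.1, accuracy_factor=3.0)
print(low_estimate, high_estimate)
```

With these figures the band comes out at roughly 100 to 300 cycles before allowances, which is of the same order as the 500 cycle band or so estimated for practical transmission.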
It is convenient to differentiate the production of vowel sounds from those of consonants. As used here vowels are taken to indicate the pure vowels, the semi-vowels, the diphthongs, and the transitionals. Some thirty-four of these are listed in the book "Speech Pathology" by Mr. Lee Edward Travers. They are as follows:
14 VOWELS: [illegible], moth, fast, [illegible], debris, obey, set, put, [illegible], truth, it, above, [illegible], [illegible]
11 DIPHTHONGS: [illegible], poor, care, mile, cafe, [illegible], here, over, [illegible], bare, bore
5 SEMI-VOWELS: [illegible], no, [illegible], lady, ride
4 TRANSITIONALS: [illegible], yes, he, we
These vowel and near vowel sounds have in general been produced artificially so that we have a very good check on the number of parameters involved in producing them. From the vocal standpoint starting at the lungs and coming out to the mouth opening the variables that may be controlled independently are:
1. Lung pressure.
2. Vocal cords.
3. Rear mouth resonance chamber.
4. Opening from rear to front resonance chamber.
5. Front mouth resonance chamber.
6. Opening from mouth.
These six are more or less in the direct circuit of speech whereas for a by-path we have two more:
7. Nasal resonance chamber.
8. Position of soft palate and uvula (opening to nasal chamber).
The eight of these do not act completely independent of one another. Thus 3, 4 and 5 act decidedly in unison. Some do not or at least need not vary greatly, as 6, the mouth opening, which may be kept fixed for the production of all the vowels. Again the soft palate (8) may open or close the nasal chamber, intermediate positions being unimportant. The eight variables given then probably reduce to five or six in practice.
We come next to the production of the remaining sounds which are classified as fricatives and stop consonants and are again divided according to whether they are voiced or unvoiced.
The voiced ones require the use of the vocal cords; the unvoiced ones do not. They are:
8 FRICATIVE CONSONANTS
4 Voiced: vine, zero, these, azure
4 Unvoiced: fight, see, thumb, she
8 STOP CONSONANTS
4 Voiced: [illegible], do, joy, go
4 Unvoiced: [illegible], to, church, kite
The fricative consonants are produced with about the same position throughout of the vocal organs except that a certain air outlet or small aperture is formed at varying places. Thus for v and f it is formed from the lip to the teeth; for z and s it is formed from the upper teeth to the lower teeth; for the two th sounds it is formed from the tongue to the teeth; for the zh and sh sounds it is formed from the tongue to the hard palate. The voiced consonant is made by pronouncing the unvoiced consonant but vibrating the vocal cords at the same time as though to increase the volume.
The stop consonants are made by forming a stop to the passage of air in the mouth at some particular point, building pressure up behind this and then opening rapidly at the closed point so as to give an explosive sound. The stop is formed by the upper lip against the lower lip in the case of b and p, by the tongue against the upper teeth in the case of d and t, by the tongue against about the middle of the hard palate in the case of j and ch, by the tongue against the soft palate in the case of g and k. In going from the unvoiced to the voiced consonant the formation of the stop or for that matter of the opening of the outlet in the case of the fricative consonants may be slightly further front or backward.
In producing the fricative and stop consonants the different parts of the vocal system are used differently than in the production of vowels.
Thus the nasal resonance is of little importance, the vocal cords are not used in producing the unvoiced consonants, the large air chambers in the front and rear mouth are of much less importance and two new and very important factors are added in the position at which a closure is partly made and held in the case of the fricative consonants and completely made but not held in the case of the stop consonants. To list the independent variables again in the same order as before we have:
1. Lung pressure.
2. Vocal cords (for voiced consonants).
3. Nasal resonance chamber.
4. Rear mouth resonance chamber.
5. Opening between air chambers of mouth (for fricatives).
6. Front resonance chamber of mouth.
7. Position of closure or explosive opening.
Here. as in the case of vowel production, we have more parameters than are essentially independent with any large degree of freedom. Thus the vocal cords are only used for part o'f the consonants, the nasal resonance is not very important, the size of the mouth resonance chambers is -probably of limited importance. The po'- sition of the closure or opening is very important but the two are essentially the same sort parameter so they are shown as one rather than as two. Accordingly we conclude again that of these parameters 5 or 6 are ample to represent the actual variable characteristics \ln speech production.
There are a number of odd effects that have not in the discussion so far been allowed for to any extent, at least not intentionally. One thing of this sort is odd deformities or deficiencies in the usual oral structure. These may be in the form of missing teeth or teeth with separation, odd shapes of various parts of the palate, tongue, etc. Another type of odd effects that we have are those produced when we do unusual things with the voice such as whisper, talk in a falsetto tone, produce ventriloquistic sounds or produce what is called double voice. Whispering probably does not differ appreciably from other sound production so far as a typical analyzer of voice is concerned. It is apparently produced by closing off the major portion of the vocal cord opening (glottis) by some muscles which originate and end in the vocal cords themselves. This leaves a small opening at the rear of the vocal cords which produces two predominant effects:
tends towards high vibration rates with eddy currents rather than a natural frequency.
However, an analyzing circuit presumably will take care of frequency as a single thing so that provided it operates over a wide enough frequency range nothing special is required. The same is true of volume or power output. For all these odd effects it is probably reasonable to allow two or even three degrees of freedom further. However, as the eight original degrees of freedom were considered to be essentially less than eight it would seem that an allowance of eight for the total might well be approximately right. If we want to be generous perhaps we should say ten. In speech over telephone circuits of limited frequency range the number might well be 20 or 30 per cent less than that required for high quality speech production; i. e., seven or eight independent variables may suffice for commercial telephone transmission, and ten for high quality transmission.
We can get a rough check on whether these eight or ten degrees of freedom are sufficient by considering the variety of distinguishable sounds we would hope to get with this many degrees of freedom as compared to the number we might compute for actual speech. For actual speech the number of sounds for telephonic purposes is generally taken as around fifty such as the fifty given above. Perhaps the most that need be recognized for our purpose is around one hundred. [The International Phonetic Alphabet contains seventy-six sounds (no diphthongs included) from a considerable variety of languages. In addition five markings are used to indicate (1) length, (2) stress, (3) tenseness or laxness of position, (4) syllable consonants and closing consonantal parts of diphthongs and (5) nasalized vowels. Only the last seems to offer a distinct sound type for the computation being made here. Thus length applies to the time the sound is being given and not to its type; stress refers chiefly to the instantaneous volume; tenseness or laxness of position affects the pitch and volume; while syllable consonants and the closing consonantal parts of diphthongs and for that matter diphthongs themselves represent distinct sounds enunciated in rapid succession. The nasalized vowels like the French a do represent distinct sounds for our classification but of the twenty-seven vowel sounds many are not nasalized. Accordingly a total of one hundred distinct sounds, i. e. the seventy-six listed plus twenty-four for the nasalized vowels, seems not unreasonable to use here.] The same sound can be given at different volume levels. Probably a 30 decibel range with sixty recognizable steps of 1/2 decibel each would correspond to a very fine gradation basis. Fifteen steps of 2 decibels each would correspond more nearly to ordinary telephone practice.
As to pitch, a 1 per cent change is probably noticeable on a fine basis which would lead to about seventy steps per octave or one hundred forty steps altogether with two octaves. On a cruder basis corresponding more nearly to telephone practice probably twenty steps would be plenty. The only remaining item might be classified as personality. We can recognize a certain individuality or personality of sounds so that with a number of sounds from a certain person we succeed in guessing who is talking provided we are familiar enough with the person's voice. It is a little hard to say what is the maximum number of people we might hope to distinguish when a single sound is pronounced at a given pitch and volume. In the first case we will generously assume it may be as great as five hundred. More practically we will assume it is likely not to exceed ten. These sets of figures give us for recognizable elements in the fine grained picture a total number of 100 × 60 × 140 × 500 or about 4 × 10^8. In the coarser picture we have 50 × 15 × 20 × 10 or about 15 × 10^4.
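The arithmetic behind these two totals can be checked with a short sketch. It is only an illustration of the figures quoted above; the dictionary keys and variable names are hypothetical labels chosen here, and the step count per octave is obtained by assuming that successive just-noticeable pitches differ by a factor of 1.01.

```python
import math

# Assumption: pitch steps are geometric, each 1 per cent above the last,
# so the number of steps in one octave is the n solving 1.01**n == 2.
steps_per_octave = math.log(2) / math.log(1.01)

# Figures quoted in the text (labels are illustrative only).
fine = {"sounds": 100, "volume_steps": 60, "pitch_steps": 140, "voices": 500}
coarse = {"sounds": 50, "volume_steps": 15, "pitch_steps": 20, "voices": 10}

fine_total = math.prod(fine.values())
coarse_total = math.prod(coarse.values())

print(f"pitch steps per octave: {steps_per_octave:.1f}")
print(f"fine grained total:     {fine_total:.1e}")
print(f"coarser total:          {coarse_total:.1e}")
```

The computed step count comes out just under seventy per octave, in line with the round figure used in the estimate.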
Coming now to our artificial speech production system, the number of steps we can recognize in each independent variable raised to the power of the number of such variables gives us the total number of combinations available. The observer that recognizes the changes in the variables is essentially our receiving circuit rather than the ear so that the number to put down is the number corresponding to the ability to differentiate steps of signal level in our chosen machine rather than in the human ear. Probably to recognize twenty gradations is not unduly optimistic. To recognize ten is reasonably conservative. On the basis of recognizing ten and having ten parameters we conclude we should recognize altogether 10^10 different sound elements which is a little greater than the largest number we figure that we may be required to recognize. Accordingly the ten parameters seem to be quite ample from this rough check.
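As a rough cross-check of this comparison, the sketch below counts the combinations available for a given number of gradations and parameters, and finds the smallest number of parameters that would cover the required totals. The function names are hypothetical, and the required totals are simply the products quoted earlier in the text.

```python
def combinations(steps, parameters):
    """Distinct settings when each of `parameters` variables takes `steps` values."""
    return steps ** parameters


def parameters_needed(steps, required):
    """Smallest number of variables whose combinations reach `required`."""
    n = 0
    while combinations(steps, n) < required:
        n += 1
    return n


# Totals quoted in the text for actual speech.
FINE_REQUIRED = 100 * 60 * 140 * 500    # fine grained picture
COARSE_REQUIRED = 50 * 15 * 20 * 10     # coarser picture

print(combinations(10, 10))                     # ten gradations, ten parameters
print(parameters_needed(10, FINE_REQUIRED))     # conservative gradation count
print(parameters_needed(20, FINE_REQUIRED))     # more optimistic gradation count
print(parameters_needed(10, COARSE_REQUIRED))
```

On these figures ten parameters at ten gradations each leave some margin over the fine grained requirement, which agrees with the conclusion drawn above.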
If we vary any of our speech producing variables as rapidly as the controlling muscles permit we find the limiting speed is about eight to ten times per second. Accordingly each variable has a fundamental of 10 cycles or less while producing speech.
We are now in a position to calculate the band width required for transmitting the speech defining signals. On the basis of transmitting as the useful band not more than 100 per cent of the fundamental frequency (this is done in television and telephotography) we require a total frequency band of 10 × 10 × 1.00 or 100 cycles for transmitting the speech defining signals. This 100 cycles is more or less of an absolute minimum for high quality speech no matter how much money we are willing to spend. In telegraph more than 100 per cent of the fundamental frequency is required as the minimum transmitted band. Thus, in non-loaded submarine telegraph cables it is about 110 per cent (H. Nyquist); while in voice frequency telegraph where frequency space is less important it uses about 350 per cent for either side-band and transmits both. A 200 per cent basis is quite conservative. This would mean 200 cycles of useful frequency band.
A convenient method of transmitting such speech defining information is by a miniature carrier system of ten channels, one for each independent variable. In this case filters are required to separate the channels. These filters cannot cut off sharply but require waste space between used frequency bands. This waste frequency space may amount to 50 per cent of the used space leading then to a total of 1.50 × 200 cycles or 300 cycles. This 300 cycle band is on the basis of 20 cycles per parameter and a single side-band transmitted. This will require equalizing against phase distortion, which will increase the effective delay and will require rather costly phase correcting networks. If these two conditions are not justified for the circuit then there must be transmitted a frequency band wider than a single side-band. At the least this transmitted band might be negligibly wider than a single side-band as in the case of the telephotography filters. At the most it might be doubled for double side-band transmission.
It is well to summarize these estimates of required band width. For high quality speech circuits the minimum band width for speech defining signals is estimated to be as follows:
(1) Where cost is unimportant from 300 cycles at the most down to approximately 100 cycles as a lower limit.
(2) Where cost is an important item from 600 cycles down to 300 cycles.
Thus, it is found that a frequency range of approximately 50 cycles is sufficient for speech defining signals that give all the information as to the instantaneous displacement values or positions of the independent variables of speech production with sufficient accuracy for reproducing reasonably good quality speech.
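The band width estimates summarized in items (1) and (2) above follow from a few multiplications, which the sketch below retraces. It assumes ten parameters and a fundamental of 10 cycles per parameter (the upper end of the eight-to-ten figure); the constant and function names are hypothetical.

```python
PARAMETERS = 10            # independent variables of speech production
FUNDAMENTAL_CYCLES = 10    # assumed fastest variation of any one variable


def useful_band(fraction_of_fundamental):
    """Useful band when each parameter gets the given multiple of its fundamental."""
    return PARAMETERS * FUNDAMENTAL_CYCLES * fraction_of_fundamental


absolute_minimum = useful_band(1.00)       # 100 per cent basis
conservative = useful_band(2.00)           # 200 per cent basis
with_waste_space = 1.50 * conservative     # filters need 50 per cent waste space
double_side_band = 2 * with_waste_space    # both side-bands transmitted

for label, cycles in [
    ("absolute minimum", absolute_minimum),
    ("200 per cent basis", conservative),
    ("with filter waste space", with_waste_space),
    ("double side-band", double_side_band),
]:
    print(f"{label:>24}: {cycles:.0f} cycles")
```

These four values reproduce the 100, 200, 300 and 600 cycle figures quoted in the estimates.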
Having found that there are approximately ten independent variables in speech production, in setting up a circuit for artificial production of high quality speech we then need ten independent parameters. However, we need not use in any narrow frequency range transmission scheme these same ten. So long as the parameters are entirely independent we know mathematically that we can use any ten we choose. Not only can the ten be chosen in any fashion provided they are independent; but if they are not entirely independent enough more can be chosen to make up for the lack of independence. It is advantageous to pick the ten that from an engineering standpoint give a desirable design or the ten that give an optimum design. A particular case of much interest is that where most of them are the amounts of power in subbands of the frequency range of speech.
The interest in this case arises from the fact that it is based on using as parameters those physical quantities that are most easily measured. The easiest thing to measure is power, including current and voltage as measures of power, and the easiest way to separate power into parts for the extra variables needed ... to measure ... within a frequency subband. This last needed factor, in view of the nature of the energy spectrum of speech sounds, is given by the fundamental frequency of the speech sound, considering this frequency to degenerate to zero for the unvoiced sounds. Fortunately we have found means of measuring this fundamental frequency also, thus giving a complete set of specifications for reconstructing the speech sounds. A specific system involving the case just referred to is described hereinafter.
It is shown in a general schematic form in Fig. 1. Speech coming from the line west reaches the west terminal of the system. For simplicity a four-wire connection obtained with a bridge transformer or hybrid coil and a balancing network N is shown at this point. The transmission may be over a two-wire circuit, the circuit modifications being obvious. The incoming speech is transmitted through circuit 1 to the west speech frequency compressing circuit 2. Fig. 2 is a detailed showing of this west transmitting circuit or speech frequency compressing circuit 2. The frequency range of the speech is considerably decreased in circuit 2 and then the resulting signals are transmitted over the line L of limited frequency range of transmission to the receiving terminal at the east. Here there is a speech frequency restoring circuit 3 which sets up speech signals more or less the reproduction of the speech signals energizing the west sending circuit 2. Fig. 3 is a detailed showing of this east receiving circuit or speech frequency restoring circuit 3. These speech signals set up by circuit 3 are then transmitted on through speech receiving circuit 4 to the line east. Transmission in the opposite direction is carried on over limited frequency line L by the same process, the east speech frequency compressing circuit 2' being like circuit 2 except oppositely pointed, and the west speech frequency restoring circuit 3' being like circuit 3 except oppositely pointed.
The simplest circuit to use perhaps is an electrical one where the amounts of power in small frequency subbands of the speech frequency range are used as independent variables. The power in any one such subband is not entirely independent of that in the others but is sufficiently so to insure that we do not need many extra subbands on this account. Such a circuit is relatively easy to build electrically with filters, modulators, equalizers, etc., as circuit elements.
Preparatory to describing Figs. 2 and 3 in detail it is noted that they show a speech frequency range reducing and restoring system having means for reducing the frequency range of speech by analyzing the speech so as to determine or measure its unknown characteristics in terms of speech-defining currents of limited frequency range, a transmitting medium (which may have a limited frequency range of transmission) for transmitting these defining currents or defining signals set up or created by the analyzing means, and means for reconstructing, restoring, simulating, or substantially reproducing the speech in ... or speech-defining signals which are set up to replace the speech signals should not only be sufficiently complete and unique to define the speech signals without error or confusion, but moreover should be practical and convenient, so that a simple automatic analyzing circuit can be used to create or set up the speech-defining signals when a speech wave is applied. Such satisfactory definition is obtained when the speech-defining signals define the speech signals in terms of (1) frequency pattern, and (2) amplitude pattern.
Frequency pattern as used here refers to the number and position of energy-bearing frequency components. There are two distinct types. In one there is a fundamental frequency and all its upper harmonics to several thousand cycles. This fundamental frequency is the rate of vibration of the vocal cords and in general is always increasing (rising inflection) or decreasing (falling inflection) in ordinary speech although it can be held constant by sustaining a sound. In the other type of frequency pattern there is a continuous spectrum of energy so that all frequencies are present, instead of discrete ones in a harmonic ratio, although this second type may be thought of as the limiting case of the first type when the fundamental frequency approaches zero.
Amplitude pattern as used here refers to the distribution of power with frequency, from instant ... ication of the speech sound analyzed.
As will appear from the description of the drawings, about to be given, analyzing speech sounds on the basis of frequency and amplitude patterns is especially convenient because at the receiving or reproducing end of the system the simulating or reproducing devices are separated into two simple, easily recognizable groups. The devices required for producing the frequency patterns will be artificial vocal cords for which a relaxation oscillator is shown and an artificial constriction for which a source of resistance noise is shown. The devices required for producing amplitude patterns will be the artificial lungs controlling the total power put out and the artificial resonances controlling the relative amount in each small frequency band. By working on an absolute basis these characteristics are all handled in a single set of modulating controls shown as gain-controlled amplifiers, the artificial lungs being the power supply batteries or sources and the artificial resonances being filters or tuned networks.
The dual nature of the speech signals as defined electrically leads to a dual type sending or frequency range reducing circuit as shown in Fig. 2 and a dual type receiving or speech frequency restoring circuit as shown in Fig. 3. The speech currents entering the sending circuit 2 energize a frequency pattern control circuit FP and an amplitude pattern control circuit AP. The frequency pattern control circuit comprises but one channel and discriminates as to the frequency pattern. This discrimination includes discrimination as to the fundamental frequency when there is one. The amplitude pattern control circuit branches into ten channels AP1 to AP10 and determines what frequency amplitude pattern we have. The information obtained from these two analyzing elements FP and AP is then expressed in the form of electrical currents which can be used, in modulator M0 in circuit FP and in modulators M1 to M10 in channels AP1 to AP10, respectively, to modulate a number of control current frequencies which are then transmitted through channel band-pass filters C0 to C10 to the limited frequency line L on essentially a carrier system basis.
At the receiving end of line L these transmitted signals, after passing through receiving amplifier RA, are applied to receiving frequency pattern control circuit FP' and receiving amplitude pattern control circuit AP'. In circuit FP', channel band-pass filter C'0 selects the waves received from channel FP so the selected waves can act, after demodulation in demodulator DM0, on an energy source of frequency patterns FPS so as to cause currents of the proper frequency pattern to flow from this source.
In circuit AP', channel band-pass filters C'1 to C'10, in channels AP'1 to AP'10 respectively, select the currents received from channels AP1 to AP10 respectively. The selected currents, after demodulation in demodulators DM1 to DM10 of the respective receiving amplitude pattern control channels AP'1 to AP'10, are used to control shaping networks SN1 to SN10 in the respective channels to give the proper amplitude-frequency pattern to the power received from the energy source FPS. We then have our reproduction of the original speech signal for any further transmission in the ordinary manner.
The problem arises of selecting frequency bands to use in the amplitude-frequency characteristic determination. A basis for solution of the problem is found in the function known as the importance function, which has been established by articulation studies. This function varies with the amplitude of speech signal received by the ear. We shall assume that the received speech signal is at the optimum power level. Ten or so independent variables are needed for good speech transmission. Ten such bands for the entire speech frequency range should then be as follows:
Band No.   Importance function   Frequencies (cycles)   ΔF (cycles)
   1         0   to 0.1            250 to  530             280
   2         0.1 to 0.2            530 to  780             250
   3         0.2 to 0.3            780 to 1100             320
   4         0.3 to 0.4           1100 to 1500             400
   5         0.4 to 0.5           1500 to 1950             450
   6         0.5 to 0.6           1950 to 2350             400
   7         0.6 to 0.7           2350 to 2900             550
   8         0.7 to 0.8           2900 to 3750             850
   9         0.8 to 0.9           3750 to 4950            1200
  10         0.9 to 1.0           4950 to 7100            2150
This is not necessarily the best choice of frequencies that can be made. Other considerations enter such as the percentage of increase of frequency in a band, the amount of repetition of pattern in different bands, the noticeable distortion due to smallness or largeness of band, etc. However, these considerations would not greatly affect the final choice of frequencies. Accordingly this set is deemed satisfactory. It is noted that at the most important places the bands are quite narrow in width, 250 cycles, 280 cycles, etc. The three lowest ones are the three smallest and correspond ordinarily to not more than two harmonics each in the case of voiced signals. The next three probably correspond to about three harmonics in the average case of a male voice. The next one which carries us close to 3,000 cycles might correspond to four harmonics. Above 3,000 cycles they widen out quite considerably. On this frequency basis we have altogether then eleven parameters for the entire speech range, one for the frequency pattern control and ten for the amplitude pattern control. Similarly we require a total of eight parameters to go to 2,900 cycles, nine to 3,800 cycles and ten to 5,000 cycles.
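The band plan lends itself to a quick consistency check. The sketch below recomputes the band widths, counts the parameters needed to reach a given top frequency, and counts the harmonics falling in each band for one example voice. The fundamental of 140 cycles is a hypothetical value picked here purely for illustration, and the function names are likewise assumptions.

```python
# Band edges in cycles per second, as listed in the table above.
BANDS = [
    (250, 530), (530, 780), (780, 1100), (1100, 1500), (1500, 1950),
    (1950, 2350), (2350, 2900), (2900, 3750), (3750, 4950), (4950, 7100),
]

widths = [hi - lo for lo, hi in BANDS]


def parameters_to_reach(top_frequency):
    """Amplitude bands ending at or below top_frequency, plus one
    parameter for the frequency pattern control."""
    return 1 + sum(1 for _, hi in BANDS if hi <= top_frequency)


def harmonics_in_band(f0, lo, hi):
    """Number of harmonics k*f0 with lo <= k*f0 < hi."""
    return sum(1 for k in range(1, hi // f0 + 1) if lo <= k * f0 < hi)


F0_EXAMPLE = 140  # hypothetical male fundamental, chosen for illustration

print("widths:", widths)
print("parameters to 2900 cycles:", parameters_to_reach(2900))
print("parameters to 3750 cycles:", parameters_to_reach(3750))
print("parameters to 4950 cycles:", parameters_to_reach(4950))
print("harmonics per band:",
      [harmonics_in_band(F0_EXAMPLE, lo, hi) for lo, hi in BANDS])
```

For this particular fundamental the lowest three bands hold two harmonics each, the next three hold three each, and the seventh holds four, which matches the description above; other fundamentals would give somewhat different counts.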
To give a better picture of the proposed circuit it is helpful to indicate frequencies for the various circuit elements. The estimate was made above that for high quality speech transmission the speech defining signals would require transmission frequency bands as follows:
(1) 100 cycles as the absolute minimum.
(2) from 300 cycles down to 100 cycles where fairly large costs can be borne.
(3) from 600 cycles down to 300 cycles where costs must be kept relatively low.
We shall arbitrarily take a 360 cycle total transmission band with other choices such as to tend to typical results after overcoming typical difficulties. This 360 cycle band is on the basis of twelve channels each of 20 cycle pass band with a 10 cycle waste space between. Ten of these are for amplitude control as already chosen, one is for frequency pattern control and one is for synchronizing. We assume single side-band transmission to handle signals including direct current such as is used in telephotography; this means that we must use either linear phase filters of many elements per section or use ordinary filters with complicated phase equalizing networks in addition. (The assumption of a 20 cycle pass band with serious phase difficulties may be a little conservative. However so much of our problem is inherently one of phase difficulties that it is desirable to stress the phase effects.)
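The 360 cycle figure can be laid out channel by channel. The sketch below is a minimal illustration that assumes each 10 cycle waste space sits immediately below its 20 cycle pass band, an arrangement inferred from the 10 to 30 cycle and 70 to 90 cycle channel figures given elsewhere in this description; the names are hypothetical.

```python
CHANNELS = 12      # ten amplitude, one frequency pattern, one synchronizing
PASS_BAND = 20     # cycles
WASTE_SPACE = 10   # cycles, assumed to lie just below each pass band


def channel_edges(index):
    """Lower and upper edge of the pass band of channel `index` (0-based)."""
    upper = (index + 1) * (PASS_BAND + WASTE_SPACE)
    return upper - PASS_BAND, upper


layout = [channel_edges(i) for i in range(CHANNELS)]
total_band = layout[-1][1]

for number, (lo, hi) in enumerate(layout, start=1):
    print(f"channel {number:2d}: {lo:3d} to {hi:3d} cycles")
print("total transmission band:", total_band, "cycles")
```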
On the basis laid down so far we shall proceed to discuss: first, the frequency pattern circuits, second the amplitude pattern circuits and finally the overall circuit.
FREQUENCY PATTERN CONTROL
A particular form of the frequency pattern control circuit is shown in Fig. 2 at the transmitting end and Fig. 3 at the receiving end. Its operation takes advantage of the fact that in vowels and other sounds having a decided fundamental frequency in the range 80 to 320 cycles there is a high power level while in sounds like the sibilant consonants where the power is in a continuous spectrum rather than a discrete one the power level is much lower. When a high level discrete spectrum is applied the frequency pattern control circuit sends to the limited frequency line L a current indicating what the fundamental frequency so applied is, not however indicating anything about the amplitude of the fundamental frequency in the speech signal. When a low level continuous spectrum speech signal such as that of a syllable consonant is applied the circuit is not energized so no current from it is transmitted to the line.
Figs. 2 and 3 show the frequency pattern control circuit, with details of the non-linear circuit elements, the other circuit elements being more or less standard.
Referring now especially to Fig. 2, band-pass filter F0 selects the band from 250 to 500 cycles of the voice signal so as to be sure to include at least two harmonics of speech if the fundamental is below 250 cycles. This is done in case the fundamental may be ... cycles or so and therefore inefficiently transmitted over the telephone line to the point at which we determine the fundamental frequency. The output of this band-pass filter is fed to detector D which may be merely some small copper-oxide elements. This insures that a fair amount of the fundamental frequency will be present if the power level is sufficiently high as in the case of vowels. The output from the detector is sent through an attenuating network E1, of a type often termed an equalizer, which has a loss increasing with frequency for the purpose of insuring that the fundamental frequency comes out at a higher level than any upper harmonics that may be present. For practical purposes this purifies the fundamental tone. Next the output from this equalizer is fed to a constant output amplifier LA so that from this amplifier there is obtained essentially a single frequency, the fundamental of the speech signal, at a constant power level regardless of what frequency it is.
This fundamental frequency may be from about 80 to 320 cycles. Next we pass this power through an equalizer E2 similar to the one described previously so that the output from this equalizer decreases as the frequency increases. This output is sent through another copper-oxide detector D0. Essentially it gives a direct current bias that fluctuates as the fundamental frequency of speech fluctuates, that is, at syllabic frequencies.
The detector output is then sent through a low-pass filter F30 cutting off at 20 cycles so that the unwanted higher frequency products are eliminated. This output is now used as a bias on a push-pull amplifier M0 transmitting a 30 cycle tone obtained from 60 cycle oscillator O, multi-vibrator MV and selective circuit S0. This can be considered as a balanced second order modulator, the 30 cycle carrier being modulated by the biasing current as a signal. When there is no energy being received from the low-pass filter F30 the bias is such that essentially none of the tone is transmitted to the line.
In proportion as more biasing current is fed from the low-pass filter more of the 30 cycle tone is permitted to pass. The output of this modulator is fed to a band-pass filter C0 which passes the lower side-band, from 10 to 30 cycles, to the junction point for other control tones, thence to the transmitting amplifier TA and through it to the line.
At the receiving end this control current goes through the receiving amplifier RA shown in Fig. 3, after which it is picked off through a channel band-pass filter C'0 passing from 10 to 30 cycles. It is then sent through a copper-oxide or other detector DM0 for the purpose of rectification to syllabic frequencies.
Its rectification is accomplished by demodulating it in DM0 with 30 cycle current supplied from multi-vibrator MV' through selective circuit S'0. The multi-vibrator receives 60 cycle current sent from oscillator O at the transmitting end of the system through filter C11, amplifier TA, line L, amplifier RA, filter C'11 and phase shifter PS. This multi-vibrator and the multi-vibrator MV of Fig. 2 have identical circuits generating the harmonics of 30 cycles.
The syllabic frequency currents from demodulator DM0 are sent through a low-pass filter F'30, 0 to 20 cycles, where the output is used for two purposes. First it biases an amplifier VA which amplifies resistance noise received from resistance R through amplifier A. The bias is so applied to a grid biasing resistor B1 for the amplifier that when no bias is received the resistance noise from R and A is passed on through VA but as bias is received the output of VA is cut down, that is, the gain of the amplifier VA is decreased by a negative bias being applied. Secondly, the output of F'30 is applied to a biasing resistor B2 in the com... variable plate resistance R0. It is controlled in such a way as to set up the desired fundamental frequency of voice plus all of its harmonics. To insure both even and odd harmonics the circuit is arranged to take off the outputs from the two tubes of the multi-vibrator in series and in parallel and then combine these two so as to generate all the harmonic frequencies. Another possible arrangement of multi-vibrator is one at one-half the frequency from which just the even harmonics are used. When not energized by waves from filter F'30 the multi-vibrator circuit stops oscillating. The multi-vibrator output and the resistance noise circuit output from the variable gain amplifier VA are combined in a circuit leading to the amplitude controlling circuits. Preferably the multi-vibrator output is first passed through an equalizer E4, which serves to make the output power the same for each frequency, fundamental and upper harmonics.
If desired this end can be attained by making the coupling loose between the primary and secondary windings of the multi-vibrator output transformers, and equalizer E4 omitted.
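One way to picture the pairing of the sending-end equalizer and detector with the receiving-end noise amplifier and multi-vibrator is as a frequency-to-bias law at the sender and its inverse at the receiver. The sketch below is only an illustration of that idea: the linear law, the constant `F_REF`, and both function names are hypothetical and are not taken from the circuits described, which are analog networks rather than formulas.

```python
F_MIN, F_MAX = 80.0, 320.0   # fundamental range quoted in the text, cycles
F_REF = 400.0                # hypothetical constant of the illustrative law


def sender_bias(f0):
    """Bias sent for a voiced sound of fundamental f0, or 0.0 for an
    unvoiced sound (f0 is None). The bias falls as frequency rises,
    mirroring an equalizer whose output decreases with frequency."""
    if f0 is None:
        return 0.0
    if not F_MIN <= f0 <= F_MAX:
        raise ValueError("fundamental outside the assumed 80 to 320 cycle range")
    return (F_REF - f0) / F_REF


def receiver_source(bias):
    """Choose the energy source at the receiver: resistance noise when no
    bias arrives, otherwise harmonics of the fundamental recovered by
    inverting the sender's law."""
    if bias == 0.0:
        return ("noise", None)
    return ("harmonics", F_REF - F_REF * bias)


print(receiver_source(sender_bias(None)))    # unvoiced sound
print(receiver_source(sender_bias(140.0)))   # voiced sound at 140 cycles
```

Whatever law the real networks follow, the point of the sketch is that the two ends must be matched so that the received bias selects the same fundamental that produced it.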
The attenuation-frequency characteristic of equalizer E2 determines the variation, with frequency, of the amplitude of the fundamental components (of speech sounds) that are delivered to detector D0; and this input to D0 has its syllable frequency component detected by detector D0 and transmitted by carrier current transmission to the bias resistor B2 which determines the fundamental frequency of multi-vibrator MV0. So the voltage output of equalizer E2, at each fundamental frequency that E2 transmits, should be of such value that the voltage across the resistor B2 will be of proper value to cause the multi-vibrator to assume that fundamental frequency as the fundamental frequency of the multi-vibrator. Therefore, if the voltage transmission-frequency characteristic of the system from the output of equalizer E2 to the bias resistor B2 is, for example, flat, or in other words if the ratio of the equalizer output voltage to the ...
AMPLITUDE PATTERN CONTROL
The amplitude pattern control circuits are essentially circuits which at the transmitting end measure how much power there is in the speech signal in chosen small frequency bands and transmit this information by control currents to the receiving end where the output of resistance noise from VA or multi-vibrator harmonics from equalizer E4 is shaped accordingly. The frequency bands are chosen as described previously.
The transmitting circuit is shown in Fig. 2, and the receiving circuit in Fig. 3. The ten channels AP1 to AP10 and AP'1 to AP'10 are used to transmit information about the amplitude pattern. Thus the channel 70-90 cycles transmits information about the amplitudes in the speech range 250-530 cycles; the channel 100-120 cycles, in the range 530-780 cycles, etc.
Considering the channels AP1 and AP'1, for example, the output from the 250-530 cycle speech band-pass filter F1 is fed to detector D1, which may be, for instance, of the copper-oxide type. The syllabic frequencies in the output from the detector are passed through a 20 cycle low-pass filter F31 and used to bias an amplifier or balanced modulator M1 transmitting 90 cycles, the amount transmitted depending on the bias. There is no transmission through the amplifier when the bias applied is zero. As the bias builds up it permits transmission. The output from this amplifier or modulator is passed through a 70-90 cycle band-pass filter C1 to the junction point of the other circuits for amplitude and frequency pattern control, and then through the transmitting amplifier TA, line L, receiving amplifier RA and 70-90 cycle band-pass filter C'1 to a second order modulator DM1 that has a 90 cycle carrier supplied from MV' in synchronism with the 90 cycle carrier supplied by MV to M1 at the sending end. The output of DM1 is passed through a 20 cycle low-pass filter F'31. The resulting variable direct current is applied to a biasing resistor B3 to give a grid bias to a signal shaping network or push-pull amplifier SN1, which amplifies the signal power received from the multi-vibrator MV0 or the resistance noise source VA through 250-530 cycle speech band-pass filter F'1. This is then fed through a 250-530 cycle speech band-pass filter F"1 to the input of speech amplifier SA, where the outputs from nine other speech band-pass filters (of channels AP'2 to AP'10) are combined to give the original speech signal. The speech currents are then transmitted through amplifier SA to the speech receiving circuit 4.
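The carrier assignments can be tabulated from the figures given so far. The text states a 30 cycle carrier with a 10 to 30 cycle side-band for the frequency pattern channel, a 90 cycle carrier with a 70 to 90 cycle band for AP1, a 100 to 120 cycle band for AP2, and a 60 cycle synchronizing current. The sketch below extends that pattern to the remaining amplitude channels; the extension to AP3 through AP10, and all names used, are assumptions made for illustration.

```python
BASE = 30          # cycles; the multi-vibrator supplies harmonics of 30 cycles
SIGNAL_BAND = 20   # cycles; control currents are limited by 20 cycle low-pass filters


def lower_side_band(carrier):
    """Band occupied when a 0 to 20 cycle control current modulates the
    carrier and only the lower side-band is passed."""
    return carrier - SIGNAL_BAND, carrier


# Stated in the text: FP uses 30 cycles, synchronizing uses 60 cycles, AP1 uses 90 cycles.
carriers = {"FP": 1 * BASE, "SYNC": 2 * BASE}
# Assumed continuation: each further amplitude channel uses the next harmonic.
for k in range(1, 11):
    carriers[f"AP{k}"] = (k + 2) * BASE

plan = {name: lower_side_band(c) for name, c in carriers.items() if name != "SYNC"}

for name, carrier in carriers.items():
    if name == "SYNC":
        print(f"{name:>5}: {carrier} cycle tone")
    else:
        lo, hi = plan[name]
        print(f"{name:>5}: carrier {carrier:3d} cycles, band {lo:3d} to {hi:3d} cycles")
```

On this assumed continuation the highest channel ends at 360 cycles, consistent with the total transmission band chosen earlier.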
It will be understood that transmitting channels AP2 to AP10 are like AP1 except as to frequencies involved, and receiving channels AP'2 to AP'10 are like AP'1 except as to frequencies involved; and that detectors D0 to D10 may be alike, modulators M0 to M10 may be alike, demodulators DM0 to DM10 may be alike, variable gain amplifiers or signal-shaping networks SN1 to SN10 may be alike, filters F30 to F40 may be alike, and filters F'30 to F'40 may be alike.
OVERALL CIRCUIT CONSIDERATIONS
While the outputs of detectors D0 to D10 could be transmitted directly over separate lines to filters F'30 to F'40, and the illustrated carrier apparatus could then be dispensed with, it is con... at the sending end of the system should be transmitted to the receiving end or otherwise obtained synchronously at that end. In the system shown, 60 cycle voltage from oscillator O is sent through multi-vibrator MV so as to get out a large number of upper harmonics of half of 60 cycles or 30 cycles which are then selected by frequency selective circuits S0 to S10 and used in modulators M0 to M10 as the carrier supply frequencies of the eleven indicated channels FP ... Preferably the waves of the different frequencies supplied by MV should have equal power levels. The same is true with respect to ... The equality can be obtained by amplitude equalization or attenuation equalization in any suitable manner, as for example by making the coupling loose between the primary and secondary windings of the multi-vibrator output transformers (not shown), as mentioned above for the case of multi-vibrator MV0.
A number of non-linear circuits have been shown. One type, D0 to D10, is referred to as the detector circuit. In general, its purpose is similar to that of the detectors in voice operated circuits, that is, the received speech power is more or less rectified, giving a current component of syllabic frequency to be used as a control. The detector D, for obtaining the fundamental frequency of speech sounds, is similar. Where separate low frequency lines are available the outputs of detectors D0 to D10 can be sent directly to the receiving apparatus. Since, usually, such separate lines are not available, the system shown makes use of different frequency ranges on the same line. Therefore these syllable frequency control currents are translated to different parts of the frequency spectrum for transmission over the line L of limited frequency transmission range or other common medium of transmission. For this purpose modulation is used so we have a second set of non-linear circuits M0 to M10 to serve primarily as modulators. Similarly at the receiving end we have for the third type, demodulators DM0 to DM10 to give back the control currents. Next the control currents are used for two purposes each of which involves a type of non-linear circuit:
(a) To bias amplifiers VA and SN1 to SN10 and thus change the gain through them;
(b) To vary the plate resistance of vacuum tubes in circuit VR.
Sixthly there are multi-vibrators (relaxation oscillators) for (a) setting up the carrier supply frequencies in MV at the transmitting end of the system and in MV at the receiving end, and (b) setting up the frequencies in the vowel sounds in MV0. Finally there is a constant output level amplifier LA shown in Fig. 2. This makes a total of seven types of non-linear circuit, some of which however have much in common.
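The carrier supply arrangement can be pictured with a small Python sketch that lists harmonics of 30 cycles and the 20 cycle band lying just below each one. Only the 90 cycle carrier and the 70-90 cycle band of channel 1 are stated in the text; the assignment of the 60 cycle harmonic to channel 0, the use of consecutive harmonics for the remaining channels, and the name `carrier_plan` are assumptions made for illustration.

```python
def carrier_plan(base_hz=30, first_harmonic=2, channels=11, band_hz=20):
    """List (channel, carrier, band_low, band_high) for each control channel.

    Assumes consecutive harmonics of base_hz, one per channel, with the
    transmitted band lying immediately below its carrier.
    """
    plan = []
    for index in range(channels):
        carrier = base_hz * (first_harmonic + index)
        plan.append((index, carrier, carrier - band_hz, carrier))
    return plan
```

Under these assumptions `carrier_plan()[1]` is `(1, 90, 70, 90)`, which agrees with the 70-90 cycle filter C1, and adjacent bands are separated by 10 cycles.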
Delay and delay distortion probably constitute the most serious problem in designing this system. The delay can be equalized at some expense. If, however, a wider frequency band is available on the line, the delay trouble can be overcome in large degree by transmitting both side-bands. In this case they will fall in the middle of the pass band. The carrier may also be transmitted or resupplied at the receiving end.
Let us consider now how easing up in delay distortion will affect the band width required for transmission. The twelve channel circuit discussed here takes a 360 cycle band for high quality speech (7100 cycles) or 270 cycles for 2900 cycle speech, which corresponds to commercial telephone circuits. With double side-band transmission this would need to be made about twice as wide, that is, 720 and 540 cycles respectively instead of 360 and 270 as given before. These band widths can of course be adjusted up or down depending on the relative importance of the total frequency band required, of the total delays permitted, the quality desired, the permissible cost, etc. A number of things can be done to reduce the delay distortion, such as using less frequency space for the useful band and more [...]
Since the frequency pattern control branch tends to have more inherent delay than the amplitude control branches, it is desirable to have a certain amount of delay in common with all the amplitude control circuits, as is indicated by delay equalizer DE in Fig. 2 and delay equalizer DE in Fig. 3.
Two sets of band-pass filters, F0 to F10 and C0 to C10, are shown at the transmitting end and two, C'0 to C'10 and F'1 to F'10, at the receiving end, in addition to the selective carrier supply circuits and the low-pass filters. If desired, a third set, F"1 to F"10, may be used at the receiving end, making twenty instead of ten speech band filters in the receiving amplitude control circuits, for the case of the 7100 cycle band to be transmitted. These ten or twenty speech band filters might have roughly requirements of about 6 decibel loss discrimination at the separating points between two bands, 20 decibel loss at the middle of the next band and 40 decibel loss at the middle of the second band. Assuming the requirements to be approximately these, the delay in the speech band filters is relatively small, two of them in tandem giving only about ten milliseconds delay. These speech band filters can probably have much easier attenuation requirements than indicated here. Those at the receiving end in particular might well be made to be more nearly tuned circuits than sharp filters. This will tend to put out a rounded amplitude-frequency envelope for speech at the receiving end rather than one with sharp corners.
The set of channel filters at each end, that is, filters C0 to C10 and C'0 to C'10, are spaced much closer together in the frequency range and therefore inherently tend toward much larger delays. Assuming they require about 6 decibels loss discrimination at the edges of the pass bands, 20 decibels loss at the beginning of the next band and 40 decibels loss at the middle of the next band, we get a set of filters that are estimated roughly to have about [...] milliseconds delay at the edges and 50 in the middle. With this it would be necessary to equalize this delay if single side-band is used. If this is done the total delay in the band will probably be increased to about [...] milliseconds. The resulting filter will also be fairly costly, as delay correcting networks require many elements. It is quite probable that these channel filters can be used with much easier requirements also. If this system is used where wider frequency bands on the transmission lines are available, it will ordinarily be desirable to use double side-band transmission. There are other elements in the circuit that cause a certain amount of delay and therefore delay distortion, such as the low-pass filters F30 to F40 and F'30 to F'40 and the constant output amplifier LA. These, however, will total up only to rather small values. The total overall one-way delay on the basis of 20 cycle channel bands with 10 cycle separation will amount to not over .35 second.
A circuit of this sort at the receiving end needs good equalization of both attenuation and delay.
The adjustment of the amplitude control circuits can be made by transmitting a single tone in the channel under test.
For instance, for transmitting 250 [...] control channels have been used, with filters F1 to F9 respectively passing the speech bands 250-550 cycles, 550-850 cycles, 850-1150 cycles, 1150-1450 cycles, 1450-1750 cycles, 1750-2050 cycles, 2050-2350 cycles, 2350-2650 cycles and 2650-2950 cycles. Then filters F'1 to F'9 respectively passed these same bands, and likewise filters F"1 to F"9 respectively passed these same bands.
Another simplification of the circuit of some interest is obtained by dropping out the frequency [...] MV0 to the resistance noise source FPS, but must use either the resistance noise or the multivibrator now set at an arbitrarily chosen frequency such as the average vocal cord frequency. Interesting sound effects are thus produced. With the resistance noise source the manufactured speech has the sound of a whisper. With the fixed frequency multi-vibrator source the manufactured speech has the monotone of a chant. Both circuits, however, give good intelligibility at a considerable reduction in the circuit equipment required and may therefore be very useful where naturalness is not an important factor or not a requirement in transmitting speech sounds.
Ordinarily, with a system of the type shown the permissible range of volume transmitted might well be somewhere from 20 to 40 decibels, for example. The wider the frequency band the easier it is to get a large number of volume range steps at the receiving end without too fine adjustments of the apparatus. The smaller the volume range, the more readily can transmission be carried on over long circuits of high attenuation such as submarine cables. That is, signal levels down much nearer to resistance noise can be used and therefore the cable can be a cheaper cable of greater attenuation than could otherwise be used, or a wider frequency range can be used on a given cable.
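The band figures quoted in this part of the description follow from simple arithmetic, which the Python sketch below spells out. It lists the nine 300 cycle speech bands running from 250 to 2950 cycles and computes the line band width as a number of channel slots times a 30 cycle spacing (20 cycle band plus 10 cycle separation), doubled for double side-band transmission. The function names are hypothetical, and treating the 270 cycle figure as nine 30 cycle slots is an inference from the numbers rather than a statement in the text.

```python
def speech_bands(low_hz=250, width_hz=300, count=9):
    """Return the (low, high) edges of contiguous speech bands in cycles."""
    return [(low_hz + i * width_hz, low_hz + (i + 1) * width_hz)
            for i in range(count)]


def line_bandwidth(channel_slots, spacing_hz=30, double_sideband=False):
    """Band width needed on the line for the control channels, in cycles."""
    width = channel_slots * spacing_hz
    return 2 * width if double_sideband else width
```

For example, `line_bandwidth(12)` gives 360 and `line_bandwidth(9)` gives 270, and setting `double_sideband=True` doubles these to 720 and 540.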
Any desired portion of the circuit or link between the terminals of the system may be a two-wire, four-wire or radio circuit. It will be understood that for multiplex carrier transmission of a plurality of conversations simultaneously over a common transmission circuit or medium, either by wire or by radio transmission, each speech channel may have its speech band reduced in frequency range by a speech frequency compressing circuit (such as the circuit 2 of Figs. 1 and 2) and then applied to the modulating channel of the wire carrier or radio carrier multiplexing apparatus at one end of the system. At the other end the band detected by the demodulator of the wire carrier or radio carrier multiplexing apparatus will correspond to this narrowed band and may be impressed upon a speech frequency restoring circuit (such as the circuit 3 of Figs. 1 and 3).
It is emphasized that the output of the speech frequency compressing circuit 2 of Figs. 1 and 2 is unintelligible, rendering it difficult for unauthorized persons to tap the circuit and affording a high degree of secrecy in either wire or radio transmission. The waves transmitted, through the medium L, directly contain no intelligibility whatever.
Where the circuit connecting the terminals of the transmission system is, for example, a trans-Atlantic submarine telephone cable, the reduction of the frequency range transmitted can be of great aid especially because of the difficulty of inserting the desired number of amplifiers to keep down the attenuation at high frequencies.
In multiplex telephony over cable or open wire lines or by radio transmission, the reduction of the frequency range can facilitate an increase of the number of channels on a line or in the transmission medium, as for example the placing of two or more speech channels, on the line or in the transmission medium, in the frequency range ordinarily occupied by one.
The compression of the frequency range of the transmission, in accordance with the invention, can increase the signal-to-noise ratio obtaining at the receiver without necessitating increase of the sending level of the signal. This is of especial importance in systems, for example, long radio transmission systems such as the trans-Atlantic, in which the transmission medium is unsatisfactory for the desired grade of transmission of the original communication signals because of inadequate signal-to-noise ratio in the transmission medium.
It is seen that the system described above, instantaneously analyzing speech waves, and then, upon the basis of the information obtained by the analysis, synthesizing or reconstructing the speech, illustrates doing electrically what one does in mimicking when he listens to speech sounds with his ear and immediately reproduces them with his own vocal system. In this process three operations may be distinguished: first, the aural perception and analysis of the sound; second, the passage of neural impulses for the control of the vocal organs; and third, the reproduction of the sound. In the system described above these operations are electrical, or may be considered electro-acoustical if the microphone supplying the electrical speech waves to be analyzed and the loud-speaker or telephone receiver actuated by the reconstructed electrical sound wave are regarded as included in the system.
These operations are to be distinguished from those of ordinary telephony. In telephony there is neither analysis nor synthesis of the sound. Instead, a listening device, the telephone transmitter, picks up speech sounds and translates them into current variations; a complex alternating current, with components corresponding to those of the sound wave, is then transmitted to the other terminal; and there a reproducing, or talking, device in the form of a telephone receiver reproduces the speech. In ordinary telephony a sound wave is electrically moved from one point to another by direct transmission; but when the synthesizing process described above is to be used, only the specifications for reconstructing the sound wave are directly transmitted.
By the analyzing device in the system described above, the electrical speech wave coming from the microphone is instantaneously analyzed for the three characteristics, pitch, resonant frequency regions and loudness; control currents are then passed to the speech synthesizer; in this a new current is produced, manufactured according to the specification of the speech analyzer; and this current is suitable for actuating a telephone receiver to produce acoustical speech waves.
Thus the system between the microphone and the telephone receiver analyzes a sound-bearing electrical current and synthesizes a mimicking electrical current. On this basis the analyzing device is the artificial ear of the mimicking system and the synthesizing device is the artificial vocal system. On the other hand, if one thinks of an artificial ear as a device for analyzing an acoustic speech wave, then the microphone is to be regarded as part of the artificial ear. Likewise from the acoustical standpoint the artificial vocal system would be regarded as including the loudspeaker or telephone receiver.
Of the two basic elements, the analyzer and the synthesizer, the analyzer is like the ear in that it analyzes speech sounds to determine their pitch, resonant frequency regions and loudness.
In the synthesizer described in detail above, the element equivalent to the vocal system is entirely electrical, and produces an electrical current. This synthesizer has electrical equivalents of all of the important elements of the vocal system. To bring this out, the operation of the vocal system should be considered. The three fundamental steps in producing speech sounds are: (1) providing an energy source; (2) modulating energy from this source so that it will contain vibrations in the frequency range audible to the ear; and (3) selecting from these audible vibrations by resonance to get the various speech sounds.
The energy source is provided by the pressure on the air in the lungs as it is being exhaled. In a normal exhalation this flow of air is not audible. It is made audible by forming a partial closure and forcing the air through under pressure. The air particles which get through then vibrate in rapid complex motions at frequencies that are audible to the ear. For voiced sounds, this partial closure is at the vocal cords, which then vibrate at their natural periodic rate, depending on the tension the talker places on them. For unvoiced sounds the closure is in the mouth passage. Thus s is formed by placing the tongue against the hard palate and forcing the air through the narrow opening between them. After these audible vibrations are set up it is still necessary to provide means for differentiating the various sounds from one another. This is done by shaping the mouth to form resonant air chambers which favor certain frequencies but discriminate against others. It is in this way that a differs from ee although in both cases the vocal cords are vibrating.
Starting from the lungs and going out toward the mouth there are (1) the steady power source in the lungs, (2) the vibration producers in the form of the vocal cords for the voiced sounds and the constrictions in the mouth passage for the unvoiced sounds, and (3) the resonance elements in the form of resonant air chambers.
The artificial speech synthesizer shown in the drawings has elements corresponding to these three elements of the vocal system as follows: (1) The steady power supply may be thought of as from batteries instead of compressed air in the lungs. (2) The vibrating elements are the multivibrator for the vocal cords and the resistance noise source for the unvoiced sounds produced at the constriction in the mouth. (3) Resonance or frequency discrimination, which determines the frequency components to be favored and those to be discriminated against, is provided by tuned or frequency selective electrical networks instead of air chambers as in the mouth. The output is an electrical wave instead of an acoustic wave.
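The three elements listed above can be illustrated, very loosely, for voiced sounds with the Python sketch below: a set of equal-amplitude harmonics stands in for the multivibrator, and each harmonic is given the gain of the frequency band in which it falls, standing in for the frequency selective networks and variable gain amplifiers. The function names and the direct summation of sinusoids are assumptions made for illustration, and the resistance noise source used for unvoiced sounds is left out for brevity.

```python
import math


def buzz_harmonics(fundamental_hz, top_hz):
    """Frequencies of equal-amplitude harmonics up to top_hz (hypothetical
    stand-in for the multivibrator output)."""
    count = int(top_hz // fundamental_hz)
    return [fundamental_hz * k for k in range(1, count + 1)]


def shape_spectrum(frequencies, bands, gains):
    """Pair each frequency with the gain of the band containing it.

    Frequencies outside every band get a gain of 0.
    """
    shaped = []
    for f in frequencies:
        gain = 0.0
        for (low, high), g in zip(bands, gains):
            if low <= f < high:
                gain = g
                break
        shaped.append((f, gain))
    return shaped


def synthesize(shaped, rate_hz, n_samples):
    """Sum the weighted harmonics into a sampled wave."""
    return [sum(g * math.sin(2.0 * math.pi * f * n / rate_hz)
                for f, g in shaped)
            for n in range(n_samples)]
```

In use, the gains would be varied slowly over time by the received control currents, while the fundamental passed to `buzz_harmonics` would follow the pitch control; here both are held fixed to keep the sketch short.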
When a speech signal is analyzed by the artificial ear it is resolved into simple speech-defining signals or control currents. However, as indicated above, these currents can be modified or controlled by controlling the circuit elements in their paths and thus the manufactured speech can be made to differ in prescribed manner from the original speech.
The invention is capable of embodiment in widely different systems.
What is claimed is:
1. In communicating intelligence of a speech [...] predictable characteristics, and so combining effects of the transmitted waves received from the transmission medium and the generated waves not received from the transmitting medium as to create a synthetic speech signal containing substantially the same intelligence as the original speech signal.
2. A signaling system comprising means supplying signals that contain modulated waves whose modulations correspond to other waves of lower frequency, the unknown characteristics of said signals being definable by said other waves and a set of unmodulated waves, neither said other waves nor said unmodulated waves being sufficient to give intelligibility of the signals, means deriving from said signals said other waves to the exclusion of said unmodulated waves, wave generating means supplying under control of said derived waves a set of waves of frequencies not received by said generating means from said signals but comprised in the frequency range of said unmodulated waves, and means responsive to said derived waves and said generated waves for reproducing the signals.
3. A signaling system comprising means deriving, from speech signal waves having predictable characteristics and unpredictable characteristics, waves of frequency below the fundamental frequency of vibration of the vocal cords defining the unpredictable characteristics to the exclusion of the predictable characteristics, means for [...]
4. The method which comprises analyzing communication signal waves from the standpoint of oscillatory and modulatory characteristics, transmitting to a receiving point waves representing the modulatory characteristics and having a smaller product of frequency band width and time than said signal waves, producing locally at the receiving point waves corresponding to the oscillatory characteristics and having frequencies not present in the waves transmitted to the receiving point, and utilizing the produced waves and the transmitted waves to create waves serving as a substitute signal for the signal represented by the waves analyzed.
5. A communication system having a sending end and a receiving end for communicating a signal which is produced by a signal source and whose production involves mechanical motions, said system comprising means at the sending end of the system for analyzing the signal into variable features and features of relatively fixed nature and further analyzing the variable features into simpler signals for defining the motions involved in producing the signal to be communicated, said defining signals having the summation of their products of frequency band [...] controlled by said transmitted defining signals for reproducing said variable features.
6. A system for communicating a signal whose production involves mechanical motions comprising means for analyzing the signal into features of variable nature and features of relatively fixed nature and further analyzing the features of variable nature into signals having a smaller product of frequency range and time for defining the motions involved in producing the signal to be communicated, means for reproducing the features of relatively fixed nature independently of the signal analyzed, and means controlled by said reproducing means and said defining signals for reproducing the signal to be communicated.
7. The method of treating a speech signal whose generation involves sluggish muscular movements causing variations in acoustic pulsations resulting from rapid vibration of vocal cords, which comprises deriving, from said signal, waves of frequency lower than said acoustic pulsations, that define said variations but not said pulsations.
8. The method which comprises dividing a frequency band of waves into subbands respectively having syllabic time rates of change of energy, measuring said respective rates, generating waves having a discrete energy spectrum with substantially the same power at each frequency of energy concentration, and varying at said respective measured rates the energy in respective frequency bands of said generated waves.
9. A system comprising a plurality of channels for transmitting energy having syllabic time rates of change in the channels, respectively, a source of waves having a discrete energy spectrum and having the relative amplitudes of their components independent of the relative amplitudes of the components of said transmitted energy, and means controlled by the energy in said channels for varying at said rates, respectively, the energy in different frequency bands of waves from said source.
10. A system for communicating a message represented by a frequency band of waves, comprising a source of waves having a discrete energy spectrum and having the relative amplitudes of their components independent of the relative amplitudes of the components of the first-mentioned waves, analyzing circuits responsive to said frequency band of waves for determining the syllabic time rate of change of energy in each of a plurality of subbands of said frequency band of waves, and means controlled by said analyzing circuit for producing in corresponding frequency subbands of the output from said source a similitude of these time rates of change of energy.
11. The method which comprises dividing a frequency band of waves that represents a voice wave into subbands respectively having syllabic time rates of change of energy, measuring said respective rates, producing a complex tone having its fundamental frequency equal to that of said voice wave but having the relative amplitudes of its components independent of the relative amplitudes of the components of said voice wave, and varying at said respective measured rates the energy in respective frequency bands of said complex tone.
12. A system comprising a plurality of wave transmission channels for respectively transmitting subbands of a frequency band of waves that represents a voice wave, means for producing a complex tone having its fundamental frequency equal to that of said voice wave but having the relative amplitudes of its components independent of the relative amplitudes of the components of said voice wave, analyzing circuits responsive to waves from said channels for determining the syllabic time rate of change of energy in each of said subbands, and means controlled by said analyzing circuits for producing from the energy of said complex tone a similitude of said time rates of change of energy in said subbands.
13. The method which comprises dividing a frequency band of waves into subbands respectively having syllabic time rates of change of energy, measuring said respective rates, generating independently of said waves other waves having a continuous frequency spectrum and substantially uniform distribution of power with frequency over the frequency range of said bands, and varying at said respective measured rates the energy in respective frequency bands of said generated waves.
14. A system comprising a plurality of wave transmission channels for respectively transmitting subbands of a frequency band of waves that represents a voice wave, a source of waves representing resistance noise, analyzing circuits responsive to waves from said channels for determining the syllabic time rate of change of energy in each of said subbands, and means controlled by said analyzing circuits for producing in the output from said source a similitude of said time rates of change of energy in said subbands.
15. The method of expressing the information content of speech signals in a compressed frequency range which comprises setting up an energy pattern and varying it in a limited number of respects, not exceeding approximately ten, corresponding to a limited number of muscular changes used in speech production and with a limited time rate of change in each respect cor[...] varied in producing speech.
16. The method which comprises creating an energy pattern varying in a limited number of respects corresponding to a limited number of muscular changes used in speech production and with a limited time rate of change in each respect corresponding to the limited rate, less than the fundamental rate of vibration of vocal cords, at which the speech producing muscles can be varied in producing speech, generating waves having a discrete energy spectrum with substantially the same power at each frequency of energy concentration, and varying the amounts of power at the different frequencies in said waves in accordance with variations in said energy pattern.
17. The method which comprises deriving from a voice wave an energy pattern varying in a limited number of respects of the order of ten corresponding to a limited number of muscular changes used in speech production and with a limited time rate of change in each respect corresponding to the limited rate, not exceeding approximately ten cycles per second, at which the speech producing muscles can be varied in producing speech, deriving from said voice wave a wave having a frequency equal to the fundamental frequency of the vibrations of the vocal cords of the talker and waves of amplitudes equal to the amplitude of the last-mentioned wave and frequencies of upper harmonics of the last-mentioned wave, and varying the amounts of power at the different frequencies in said derived waves in accordance with variations in said energy pattern.
18. A system comprising means for creating an energy pattern varying in a limited number of respects of the order of ten corresponding to a limited number of muscular changes used in speech production and with a limited time rate of change in each respect corresponding to the limited rate, not exceeding approximately ten cycles per second, at which the speech producing muscles can be varied in producing speech, a source of waves generating independently of the energy of said energy pattern waves having a continuous energy spectrum and substantially uniform distribution of power with frequency, and means responsive to the variations in said energy pattern for varying the amounts of power at the different frequencies in said waves in accordance with variations in said energy pattern.
19. The method of obtaining secrecy in communication which comprises transforming a set of communication signals containing modulated waves whose modulations correspond to other waves of lower frequency into a set of defining signals defining said other waves but not said first-mentioned waves, and transmitting to a distance said defining signals to the exclusion of both said communication signals and said first-mentioned waves as a substitute for said communication signals.
[...] product of frequency band by time of the reproduced speech signals.
21. A system for communicating a speech signal comprising means for transforming said signal into waves defining the sluggish muscular movements involved in the production of the speech signal, means for transmitting said waves to the exclusion of waves corresponding to acoustic pulsations resulting from the rapid vibrations of vocal cords in producing said signal, and means responsive to said transmitted waves for reconstructing said signal.
22. The method which comprises analyzing a complex communication signal wave having known oscillatory characteristics and unknown variable modulatory characteristics into defining signals exclusive of every frequency component of the complex signal wave which do not define the known oscillatory characteristics of the complex signal wave but define the unknown variable modulatory characteristics of the complex signal wave required to reconstruct the signal.
23. A speech transmission system comprising means responsive to speech signals for producing waves whose frequency components differ from each of those of said speech signals in form and in position in the spectrum and have less total frequency range than that of said speech signals and certain of which define the amplitude pattern and certain others of which define the type of the frequency pattern of said speech signals, and means responsive to said produced waves for reproducing said speech signals.
24. The combination of a system for transmitting a complex signal and means for supplying said signal to said system, said system comprising means for subdividing said supplied signal into frequency subbands of the signal frequency band, a source of waves of frequency subbands corresponding to said frequency subbands of the signal, respectively, and means responsive to the waves of the latter frequency subbands for varying the power of the frequency subbands of the waves from said source at the syllabic time rates of change of the power in the respectively corresponding frequency subbands of said signal.
25. In combination, a source of speech signal waves and means connected to receive said waves and producing in response thereto waves having amplitude variations corresponding to frequency variations of the fundamental frequency component of said speech signal waves.
26. In combination, a source of waves having amplitude variations corresponding to frequency variations of the fundamental frequency component of speech sounds, and means for translating said amplitude variations into frequency variations.
27. In combination, a source of waves having amplitude variations corresponding to frequency variations of the fundamental frequency component of speech sounds, and means responsive to said waves for generating waves of the frequencies of said component and harmonics of the last-mentioned waves.
28. The method of reproducing speech signals which comprises deriving therefrom defining waves which define the unpredictable characteristics of the speech signals and whose product of time duration and total frequency range has an order of magnitude inferior to that of the speech signals whose unpredictable characteristics are defined by said waves, and reproducing said signals from said waves.
29. In combination, means for producing a complex wave having a fundamental component of varying frequency and harmonics thereof each of substantially the same power as said fundamental component, means for separately amplifying waves of different frequency subbands of said complex wave, means for varying the amount of the amplification of the waves of the respective subbands at frequencies low compared to the frequency of said fundamental component, and means for combining the amplified waves to form a resultant wave.
30. In combination, means for producing a complex wave having a continuous energy spectrum and having substantially uniform distribution of power with frequency, means for separately amplifying waves of different frequency subbands of said complex wave, means for varying the amount of the amplification of the waves of the respective subbands, and means for com[...]tion of the band, a circuit responsive to said subband for obtaining the fundamental frequency component of the band to the exclusion of other components, said circuit comprising a detector responsive to said subband and, in tandem therewith, equalizing means having attenuation increasing with frequency and a constant output level amplifier, a detector responsive to waves from said circuit, and a low-pass filter fed from said second detector.
32. A two-way telephone transmission system comprising a transmitting medium unsatisfactory for transmission of the original speech signals because of their frequency band width, and two two-way terminals each for transmission to and reception from said medium; each of said terminals comprising a two-way speech frequency transmission circuit and a transmitting circuit and a receiving circuit connected between said speech frequency transmission circuit and said medium; each of said transmitting circuits comprising a speech frequency band compressor and each of said receiving circuits comprising a speech frequency band restorer; each of said compressors comprising a transmitting frequency pattern control circuit and a transmitting amplitude pattern control circuit, and each of said restorers comprising an energy source of frequency patterns, a receiving frequency pattern control circuit and a receiving amplitude pattern control circuit; at one of said terminals the transmitting frequency pattern control circuit comprising means connected to the speech frequency transmitting circuit for selecting from a speech frequency band of waves a frequency subband from which the fundamental component of the speech signals can be derived, a circuit for deriving said fundamental component to the exclusion of other waves, means for detecting the syllabic frequency amplitude variations of said fundamental component, and means responsive to said variations for transmitting to said medium waves representing said variations; at the other of said terminals the receiving frequency pattern control circuit comprising means responsive to the latter waves for reproducing said variations and the energy source of frequency patterns comprising a source of power having a continuous spectrum, a source of power adding its output energy to that of said continuous spectrum source and having a discrete spectrum including a variable frequency fundamental component and harmonics thereof, and means responsive to said reproduced amplitude variations for rendering said continuous spectrum source ineffective, rendering said discrete spectrum source effective, and controlling the frequency of said fundamental and harmonic components in accordance with the frequency variations of said derived fundamental component of the speech signals at said one terminal; at said one terminal the transmitting amplitude pattern control circuit comprising a number of transmitting amplitude pattern control channels of the order of ten and a delay equalizer connecting them to said speech frequency transmission circuit; each channel comprising a speech frequency subband selector for selecting a different subband of said speech frequency band of waves, means for detecting syllabic frequency variations of the power in such subband, means for translating the detected waves to a given position in the frequency spectrum, said positions being different for the respective channels, and a channel frequency band selector for transmitting the translated waves to said medium; at said other terminal the receiving amplitude pattern control circuit comprising a delay equalizer and receiving amplitude pattern control channels corresponding respectively to said transmitting amplitude pattern control channels, for receiving waves transmitted thereto from said transmitting amplitude pattern control channels through said medium; each receiving amplitude pattern control channel comprising a receiving amplitude pattern control channel frequency band selector selecting the band transmitted by the channel frequency band selector of the corresponding transmitting amplitude pattern control channel, means for restoring the waves received from the receiving amplitude pattern control channel frequency band selector to their original position in the frequency spectrum, means responsive to the restored waves for controlling the power level in a subband of the frequency band of waves from said energy source of frequency patterns and transmitting the controlled subband of waves to the speech frequency transmission circuit, said subband having the frequency limits of the speech frequency subband selector of the transmitting amplitude pattern control channel corresponding to the receiving amplitude pattern control channel.
33. The method of reproducing a signal containing variable information and invariable information and represented by a complex wave which comprises deriving from the complex wave a simple set of parameters having approximately the number of degrees of freedom of the variable elements of the signal source, deriving from said parameters a set of defining waves that respectively define the variations of said parameters, transmitting through a wave transmission medium said defining waves to the exclusion of waves representing said invariable information, producing artificially waves representing said invariable information to the exclusion of said variable information, and combining effects of said artificially produced waves and said transmitted waves to reproduce said signal.
34. The method which comprises analyzing a speech signal into two factors one of which represents the natural flow of energy from the mechanical-acoustical oscillating system comprising the vocal system of a man and the other of which represents the rational control applied to vary this flow of energy so as to impress an information content on it, transmitting signals derived essentially from the second factor alone to a distant point, and there originating an energy stream equivalent to said natural flow of energy before said information content was impressed on it and using the transmitted signals to control the energy stream in such way as to reproduce the original speech signals.
35. The method of analyzing a signal having a fundamental component of varying frequency and harmonics thereof lying in the frequency range of the varying fundamental component, which varying frequency fundamental component of the voiced speech sounds to the exclusion of the other components of the speech sounds and utilizing said selected component to create a wave consisting essentially of a fundamental component of said varying frequency and harmonics thereof.
37. The method which comprises producing waves containing the fundamental frequency component of waves representing voiced sounds by intermodulation of upper harmonics of the varying frequency fundamental component, and suppressing the harmonics by transmitting the modulation products with transmission efficiency decreasing as frequency increases over a substantial frequency range including a major portion of the frequency range common to the varying frequency fundamental component and its harmonics.
38. The method of reproducing signals which comprises deriving, from the particular signal being reproduced, waves which are a function of said signal, producing under control of said signal, waves of character dependent upon the character of the signal source but not upon the particular signal being reproduced; and combining the first-mentioned waves and the second-mentioned waves to reproduce the signal.
39. The method which comprises creating a set of waves defining the frequency-time content of a signal, creating a set of waves defining the amplitude-time pattern of the signal in a series of frequency bands, the summation of the products of the frequency bands by time of said two sets of waves being less than the product of the frequency band by time of said signal, and utilizing said two sets of waves to construct said signal.
40. In combination, a source of waves defining the frequency-time content of a signal, a source of waves defining the amplitude-time pattern of the signal in a series of frequency bands, the summation of the products of the frequency bands by time of said two sources being less than the product of the frequency band by time of said signal, and means controlled by said two sources for reconstructing said signal.
41. A privacy method for speech transmission
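As an illustration of the method recited in claim 37, the following is a minimal sketch, not the patented circuit. It assumes a test wave made of only the 3rd, 4th and 5th harmonics of a 100 Hz fundamental, a square-law detector as the intermodulating element, and a first-order low-pass filter standing in for "transmission efficiency decreasing as frequency increases". The sample rate, cutoff, harmonic numbers and all function names are hypothetical choices made for the example.

```python
import cmath
import math

FS = 8000   # sample rate in Hz (assumed for the sketch)
F0 = 100.0  # fundamental in Hz (assumed; deliberately absent from the input)


def tone_amplitude(samples, freq, fs=FS):
    """Amplitude of the `freq` component, measured with a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * n / fs)
              for n, s in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)


def square_law(samples):
    """Square-law detector: intermodulates the harmonics, then removes DC."""
    squared = [s * s for s in samples]
    mean = sum(squared) / len(squared)
    return [s - mean for s in squared]


def one_pole_lowpass(samples, cutoff_hz, fs=FS):
    """First-order low-pass filter; its gain falls as frequency rises."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    out, y = [], 0.0
    for s in samples:
        y += a * (s - y)
        out.append(y)
    return out


def demo():
    """Return per-harmonic amplitudes before and after the processing."""
    n_total = 2 * FS  # two seconds; the first second is filter warm-up
    upper = [sum(math.cos(2 * math.pi * k * F0 * n / FS) for k in (3, 4, 5))
             for n in range(n_total)]
    recovered = one_pole_lowpass(square_law(upper), cutoff_hz=150.0)
    before = {k: tone_amplitude(upper[FS:], k * F0) for k in range(1, 11)}
    after = {k: tone_amplitude(recovered[FS:], k * F0) for k in range(1, 11)}
    return before, after
```

Calling `demo()` returns two dictionaries keyed by harmonic number. Under these assumptions the input has essentially no energy at the fundamental, while in the processed wave the fundamental, produced as the difference tone of adjacent harmonics, is the strongest component.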
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US2439293 *||May 30, 1945||Apr 6, 1948||Radio Patents Corp||Speech transmission system|
|US2458227 *||Apr 18, 1946||Jan 4, 1949||Hartford Nat Bank & Trust Co||Device for artificially generating speech sounds by electrical means|
|US2466880 *||Dec 17, 1946||Apr 12, 1949||Bell Telephone Labor Inc||Speech analysis and synthesis system|
|US2486106 *||Feb 23, 1944||Oct 25, 1949||Brown Robert H||Noise generating system|
|US2512889 *||Mar 20, 1945||Jun 27, 1950||Dreyfus Jean Albert||Electroacoustic apparatus|
|US2522539 *||Jul 2, 1948||Sep 19, 1950||Bell Telephone Labor Inc||Frequency control for synthesizing systems|
|US2561478 *||May 28, 1948||Jul 24, 1951||Bell Telephone Labor Inc||Analyzing system for determining the fundamental frequency of a complex wave|
|US2562109 *||Apr 30, 1948||Jul 24, 1951||Bell Telephone Labor Inc||Signal wave analyzer for deriving pitch information|
|US2575909 *||Jul 1, 1949||Nov 20, 1951||Bell Telephone Labor Inc||Voice-operated system|
|US2575910 *||Sep 21, 1949||Nov 20, 1951||Bell Telephone Labor Inc||Voice-operated signaling system|
|US2593694 *||Mar 26, 1948||Apr 22, 1952||Bell Telephone Labor Inc||Wave analyzer for determining fundamental frequency of a complex wave|
|US2593695 *||May 10, 1948||Apr 22, 1952||Bell Telephone Labor Inc||Analyzer for determining the fundamental frequency of a complex wave|
|US2593698 *||May 10, 1948||Apr 22, 1952||Bell Telephone Lab Inc||Apparatus for determining pitch frequency in a complex wave|
|US2635146 *||Dec 15, 1949||Apr 14, 1953||Bell Telephone Labor Inc||Speech analyzing and synthesizing communication system|
|US2640880 *||Jul 24, 1948||Jun 2, 1953||||Speech communication system|
|US2766325 *||Sep 1, 1951||Oct 9, 1956||Itt||Narrow band communication system|
|US2866000 *||Sep 28, 1953||Dec 23, 1958||Lenkurt Electric Co Inc||Carrier communication system|
|US2866001 *||Mar 5, 1957||Dec 23, 1958||Caldwell P Smith||Automatic voice equalizer|
|US2881257 *||Aug 16, 1956||Apr 7, 1959||Bell Telephone Labor Inc||Spectrum synthesizer|
|US2883465 *||Dec 17, 1953||Apr 21, 1959||Vilbig Friedrich||Frequency band transformer|
|US2892892 *||Oct 7, 1955||Jun 30, 1959||Bell Telephone Labor Inc||Vocoder absorption modulation system|
|US2902543 *||May 14, 1957||Sep 1, 1959||Friedrich Vilbig||Signal transmission|
|US2911476 *||Apr 24, 1956||Nov 3, 1959||Bell Telephone Labor Inc||Reduction of redundancy and bandwidth|
|US2928901 *||Apr 13, 1956||Mar 15, 1960||Bell Telephone Labor Inc||Transmission and reconstruction of artificial speech|
|US2928902 *||May 14, 1957||Mar 15, 1960||Friedrich Vilbig||Signal transmission|
|US2938079 *||Jan 29, 1957||May 24, 1960||James L Flanagan||Spectrum segmentation system for the automatic extraction of formant frequencies from human speech|
|US2958039 *||May 18, 1956||Oct 25, 1960||Univ California||Delay line time compressor|
|US3004459 *||Dec 31, 1956||Oct 17, 1961||Baldwin Piano Co||Modulation system|
|US3030168 *||May 2, 1957||Apr 17, 1962||Hurvitz Hyman||Multiple stylus recorder for analog signals|
|US3030450 *||Nov 17, 1958||Apr 17, 1962||Bell Telephone Labor Inc||Band compression system|
|US3078345 *||Jul 31, 1958||Feb 19, 1963||Melpar Inc||Speech compression systems|
|US3090837 *||Apr 29, 1959||May 21, 1963||Ibm||Speech bandwidth compression system|
|US3102929 *||Mar 26, 1959||Sep 3, 1963||Philips Corp||Speech-signal transmission system|
|US3124654 *||May 22, 1961||Mar 10, 1964||||Transmitter|
|US3162819 *||Jun 27, 1960||Dec 22, 1964||Bell Telephone Labor Inc||Polyphase demodulation|
|US3246084 *||Aug 26, 1960||Apr 12, 1966||Bolt Beranek & Newman||Method of and apparatus for speech compression and the like|
|US3373245 *||Aug 27, 1942||Mar 12, 1968||Bell Telephone Labor Inc||Production of current of random variation|
|US3470323 *||Jun 30, 1944||Sep 30, 1969||Bell Telephone Labor Inc||Signaling system|
|US3509281 *||Sep 29, 1966||Apr 28, 1970||Ibm||Voicing detection system|
|US3924074 *||May 19, 1945||Dec 2, 1975||Bell Telephone Labor Inc||Pulse position modulation key signaling system|
|US3965296 *||Jun 30, 1944||Jun 22, 1976||Bell Telephone Laboratories, Incorporated||Signaling system|
|US3979558 *||Jun 30, 1944||Sep 7, 1976||Bell Telephone Laboratories, Incorporated||Signaling system|
|US5133010 *||Feb 21, 1990||Jul 21, 1992||Motorola, Inc.||Method and apparatus for synthesizing speech without voicing or pitch information|
|US5471527 *||Dec 2, 1993||Nov 28, 1995||Dsc Communications Corporation||Voice enhancement system and method|
|US5940791 *||May 9, 1997||Aug 17, 1999||Washington University||Method and apparatus for speech analysis and synthesis using lattice ladder notch filters|
|US6256609||May 11, 1998||Jul 3, 2001||Washington University||Method and apparatus for speaker recognition using lattice-ladder filters|
|EP0255524A1 *||Dec 22, 1986||Feb 10, 1988||Motorola Inc||Method and apparatus for synthesizing speech without voicing or pitch information.|
|EP0255524A4 *||Dec 22, 1986||Jun 23, 1988||Motorola Inc||Method and apparatus for synthesizing speech without voicing or pitch information.|
|U.S. Classification||380/39, 324/76.29, 379/414, 324/76.12, 704/205, 704/209, 704/206, 84/701, 84/DIG.110|
|Cooperative Classification||Y10S84/11, H04B1/667|