WO2000070602A1

WO2000070602A1 - Method of evaluating the rhythmicity of a digital signal composed of samples

Info

Publication number: WO2000070602A1
Application number: PCT/FI2000/000445
Authority: WO
Inventors: Paavo Eskelinen
Original assignee: Voxlab Oy
Priority date: 1999-05-18
Filing date: 2000-05-17
Publication date: 2000-11-23
Also published as: AU4572400A; EP1210710A1; FI991132A0; FI991132A

Abstract

The invention relates to a method, apparatus and memory means for evaluating the rhythmicity of a digital signal (200) composed of samples (204). The method comprises the following steps: (102) setting the next period of the signal under examination; (104) finding local extreme values of the samples; (106) computing the temporal distance between each two adjoining extreme value samples in turn; (108) if at least two adjoining temporal distances differ from each other at most by a preset limit, the extreme value samples used for computing said temporal distances constitute a rhythmic series; (110) repeating the steps of the method until the whole signal has been examined.

Description

METHOD OF EVALUATING THE RHYTHMICITY OF A DIGITAL SIGNAL COMPOSED OF SAMPLES

FIELD OF THE INVENTION

The invention relates to a method of evaluating the rhythmicity of a digital signal composed of samples.

BACKGROUND OF THE INVENTION

Rhythmicity of a digital signal composed of samples means that a certain pattern recurs regularly in the signal. For example, in a signal formed of the heart rate a stronger signal point recurs at regular intervals at the frequency of the heart rate. In a signal formed of speech the voicedness of speech is expressed as the rhythmicity of pitch.

Prior art methods of evaluating the basic rhythmicity of a signal are described in Lawrence R. Rabiner et al: A Comparative Performance Study of Several Pitch Detection Algorithms, in IEEE Transactions on Acoustics, Speech and Signal Processing, October 1976.

A problem related to the prior art solutions is that they require a lot of calculation capacity and are thus not suitable for simultaneous real time evaluation of the rhythmicity of several signals by a device with a relatively low capacity, such as a personal computer, or they are too inaccurate for use in practical applications. Neither do they identify momentary changes in the pitch, but provide an average value of the pitch.

BRIEF DESCRIPTION OF THE INVENTION

The object of the invention is to provide a method and an apparatus implementing the method to solve the above-mentioned problems. This is achieved with the method to be described in the following. This is a method of evaluating the rhythmicity of a digital signal composed of samples. The method comprises the following steps: setting the next period of the signal under examination; finding local extreme values of the samples; computing the temporal distance between each two adjoining extreme value samples in turn; if at least two adjoining temporal distances differ from each other at most by a preset limit, the extreme value samples used for computing said distances constitute a rhythmic series; repeating the steps of the method until the signal has been examined.

The invention also relates to an apparatus for evaluating the rhythmicity of a digital signal composed of samples, comprising: a signal input; sampling means for taking samples from the signal at a certain frequency; a microprocessor for processing the samples. The microprocessor is arranged to: divide the signal into at least one period; find local extreme values of the samples; compute the temporal distance between each two adjoining extreme value samples in turn; conclude: if at least two adjoining temporal distances differ from each other at most by a preset limit, the extreme value samples used for computing said distances constitute a rhythmic series.

The invention further relates to a memory means that can be read by a computer, comprising a computer program which is to be run in a computer and performs the method of evaluating the rhythmicity of a digital signal composed of samples. The method comprises the steps of: setting the next period of the signal under examination; finding local extreme values of the samples; computing the temporal distance between each two adjoining extreme value samples in turn; if at least two adjoining temporal distances differ from each other at most by a preset limit, the extreme value samples used for the computation of said distances constitute a rhythmic series; repeating the steps of the method until the signal has been examined.

The preferred embodiments of the invention are disclosed in the dependent claims.

The invention is based on finding rhythmic series in a signal by means of a simple method which comprises comparing differences in temporal distances between the extreme value samples of a signal with one another. The method and apparatus according to the invention provide several advantages. The method utilizes the calculation capacity of a microprocessor efficiently. If necessary, even an ordinary personal computer can be used for finding rhythmic series in several hundreds of different signals in parallel and in real time. The method is also accurate and provides a detailed description of momentary changes in rhythmicity.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be described in greater detail by means of preferred embodiments with reference to the accompanying drawings, in which

Figure 1A is a flow chart illustrating the steps of a method according to the invention;

Figure 1B illustrates a first preferred principal embodiment of the method according to the invention;

Figure 1C illustrates a second preferred principal embodiment of the method according to the invention;

Figures 2A and 2B illustrate how samples are taken from a signal;

Figures 3A and 3B illustrate evaluation of a signal to find rhythmicity according to the invention;

Figure 4 illustrates the essential components of an apparatus implementing the invention;

Figure 5 illustrates a method of finding extreme value samples.

DETAILED DESCRIPTION OF THE INVENTION

Figure 2A illustrates a signal 200 placed in a system of coordinates where the x-axis represents time T and the y-axis the signal value V at any given moment. The signal 200 in question is an analogue signal which can be digitised in the manner to be shown in Figure 2B. Samples 204 are taken from the signal at a certain frequency 202; each sample is illustrated in the figure with a line the end of which is provided with a black dot. In the example of Figure 2B the signal 200 is illustrated by the 24 samples shown. The signal 200 has been scaled to be placed in the system of coordinates so that the point where the x-axis crosses the y-axis is the same as the mean value of the signal. Thus the mean value of the signal is represented by the x-axis 206.

In some cases the sample 204 values of the signal 200 in Figure 2B can be interpreted as being represented by different voltages. In that case the y-axis represents the voltage values. The mean value 206 of the signal 200 is zero volts and the signal may receive values from the negative voltage values below the x-axis to the positive voltage values above the x-axis.

The sample 204 values of the signal 200 can also be described otherwise, e.g. by scaling all values to positive or negative voltage values. Another way of illustrating the sample 204 values is in the internal representation of a computer, e.g. as hexadecimal, octal or binary numbers. It is irrelevant to the invention how the sample 204 values are described, provided that the values are presented so that their order of magnitude and the temporal distance between the samples 204 can be determined.

Referring to Figures 1A, 3A and 3B, the method of examining possible rhythmicity included in a signal according to the invention will be described in the following.

Figure 1A illustrates the steps of the method. The method starts in block 100.

First, the next period 320 of the signal 200 is set under examination in block 102. This means that the signal is divided into subsequent periods which are examined in turn. According to the invention, the whole signal can also be examined in one go. A third possibility is to use a sliding window in which a small portion of the signal is examined, which is followed by examination of the next small portion. This method requires slightly more memory and calculation capacity than the first method since temporary information needs to be stored on how the examination proceeds during the processing.

The examples illustrate the first method, i.e. 24 samples 204 obtained from the signal 200 are examined in two different periods 320 and 322.

Next, the local extreme values of the samples are searched for in block 104. In the first preferred principal embodiment the function of block 104 is performed in block 120 illustrated in Figure 1B, in which extreme value samples 301, 302, 303, 304, 305, 306 that deviate most from the mean value 206 of the samples 204 are searched for. In this example these are the samples that receive the highest and the lowest values. Preferably a predetermined number of samples are selected as extreme value samples 301, 302, 303, 304, 305, 306, although other selection criteria can also be used. The predetermined number of extreme value samples 311 preferably depends on the type of rhythmicity. In this example six extreme value samples are selected from each period 320, 322. For the sake of clarity, Figure 3B shows only the selected extreme value samples 301, 302, 303, 304, 305, 306.
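The selection performed in block 120 can be sketched as follows. This is a minimal illustration in Python, not the implementation of the invention: the function name `select_extreme_samples`, the tie-breaking rule (earlier samples win) and the sample values are assumptions made for the example, and the sketch ranks samples purely by their deviation from the mean without checking that each one is a local extreme.

```python
def select_extreme_samples(samples, count):
    """Return the positions (in time order) of the `count` samples that
    deviate most from the mean value of `samples`.

    Hypothetical helper: name, signature and tie-breaking are assumptions.
    """
    mean = sum(samples) / len(samples)
    # Rank sample positions by absolute deviation from the mean, largest first.
    # Python's sort is stable, so equal deviations keep their time order.
    ranked = sorted(
        range(len(samples)),
        key=lambda i: abs(samples[i] - mean),
        reverse=True,
    )
    # Keep the requested number of samples and restore time order.
    return sorted(ranked[:count])


# Illustrative period of eight samples with mean value zero.
period = [0, 5, 0, -5, 0, 1, 0, -1]
strongest = select_extreme_samples(period, 2)
```

In this illustrative period the two strongest deviations are the samples at positions 1 and 3, one above and one below the mean value.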

Then in block 106 the temporal distance 330, 332, 334, 336, 338 between each two adjoining extreme value samples, i.e. samples 301 to 302, 302 to 303, 303 to 304, 304 to 305 and 305 to 306, is computed in turn.

In block 108 it is finally concluded: if at least two adjoining temporal distances differ from each other at most by a preset limit, the extreme value samples 303, 304, 305, 306 used for computing these distances constitute a rhythmic series. It can be seen in the figure that the temporal distances 334, 336, 338 are the same, i.e. two sampling intervals at the sampling frequency 202. Extreme value samples 303, 304, 305, 306 were used in the computation of these distances 334, 336, 338, which means that these samples constitute a rhythmic series.
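Blocks 106 and 108 can be sketched as follows, assuming that the extreme value samples are given as sample positions in time order and that distances are measured in sampling intervals. The function names, the grouping of matching distances into maximal runs and the positions used below are assumptions made for illustration; the positions are chosen so that the last three distances are equal, as in the example of Figure 3A.

```python
def temporal_distances(positions):
    """Block 106: distance between each two adjoining extreme value samples."""
    return [b - a for a, b in zip(positions, positions[1:])]


def find_rhythmic_series(positions, limit):
    """Block 108: return every maximal run of extreme value samples whose
    adjoining temporal distances differ from each other by at most `limit`."""
    distances = temporal_distances(positions)
    series = []
    run_start = None  # index of the first distance in the current run
    for k in range(1, len(distances) + 1):
        matches = (
            k < len(distances)
            and abs(distances[k] - distances[k - 1]) <= limit
        )
        if matches and run_start is None:
            run_start = k - 1
        elif not matches and run_start is not None:
            # Distances run_start..k-1 were computed from samples run_start..k.
            series.append(positions[run_start:k + 1])
            run_start = None
    return series


# Hypothetical positions of six extreme value samples in one period.
extremes = [0, 1, 5, 7, 9, 11]
found = find_rhythmic_series(extremes, limit=0)
```

With a limit of zero only the last four samples qualify, because the three distances between them are all two sampling intervals.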

In block 110 it is checked whether the whole signal 200 has been examined. If the whole signal 200 has been examined, we proceed to block 112 where the method ends. If the whole signal 200 has not been examined yet, we return to block 102 where the period 322 following the examined period 320 will be set under examination.

It should be noted that the example described is a rough simplification for the sake of clarity. A real signal may contain e.g. 8000 samples per second. Preferably the maximum and the minimum limits set on the deviation in temporal distance and the respective tolerances depend on the type of rhythmicity.
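As a rough illustration of how such limits might be expressed, the sketch below converts between temporal distances in samples and in seconds at 8000 samples per second and checks a distance against limits chosen per type of rhythmicity. The table `RHYTHM_LIMITS`, its key `"speech_pitch"` and all numeric limits are hypothetical values chosen for the example and are not taken from the invention.

```python
SAMPLE_RATE_HZ = 8000  # example rate mentioned in the text

# Hypothetical limits per type of rhythmicity, in samples:
# (minimum distance, maximum distance, tolerance between adjoining distances).
# At 8000 samples per second, 20 to 160 samples is 2.5 ms to 20 ms.
RHYTHM_LIMITS = {
    "speech_pitch": (20, 160, 4),
}


def distance_to_seconds(distance_in_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a temporal distance from samples to seconds."""
    return distance_in_samples / sample_rate_hz


def seconds_to_distance(seconds, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a duration in seconds to the nearest whole number of samples."""
    return round(seconds * sample_rate_hz)


def distance_allowed(distance_in_samples, kind):
    """Check a distance against the minimum and maximum for the given type."""
    minimum, maximum, _tolerance = RHYTHM_LIMITS[kind]
    return minimum <= distance_in_samples <= maximum
```

For instance, a distance of 80 samples corresponds to 10 milliseconds at this sampling rate, which lies inside the assumed range.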

When the examination of the signal 200 continues by processing of the next period 322, there are different options for selecting the starting point of the next period 322.

In a preferred embodiment the starting point of the next period 322 is the sample 360 that follows the last sample 306 of the preceding period 320. In that case the signal 200 is examined in periods 320, 322 that follow each other in time. Handling of the period boundaries then causes problems: a rhythmic series may continue across the boundary between the periods 320, 322. This can, however, be handled by comparing the information of the preceding period 320 with that of the following period 322.

In a preferred embodiment the starting point of the next period 322 is the last extreme value sample 306 of the rhythmic series found in the preceding period 320. Thus the periods 320, 322 overlap to some extent, which makes the examination of the signal 200 slower, but may simplify the handling of the period boundaries.

In a preferred embodiment the starting point of the next period 322 is after the first sample 362 of the preceding period 320. In this case the periods 320, 322 may overlap as much as desired, regardless of whether rhythmic series were found or not in the processing of the preceding period 320.
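The three options for the starting point can be summarised in a small sketch. The function name `next_period_start`, the mode names and the `step` parameter are hypothetical and introduced only for illustration; positions are sample indices counted from the beginning of the signal.

```python
def next_period_start(start, length, mode, last_series_sample=None, step=1):
    """Return the starting position of the period following the one that
    begins at `start` and contains `length` samples.

    mode "consecutive": start at the sample following the preceding period.
    mode "from_series": start at the last extreme value sample of the
        rhythmic series found in the preceding period, if there is one.
    mode "overlap":     start `step` samples after the preceding start, so
        the periods may overlap as much as desired.
    """
    if mode == "consecutive":
        return start + length
    if mode == "from_series":
        # Fall back to consecutive periods when no series was found, or when
        # restarting at the given sample would not move the period forward.
        if last_series_sample is None or last_series_sample <= start:
            return start + length
        return last_series_sample
    if mode == "overlap":
        if step < 1:
            raise ValueError("step must be at least one sample")
        return start + step
    raise ValueError(f"unknown mode: {mode!r}")
```

With periods of 12 samples, the first option moves from position 0 to position 12, whereas the second option restarts at, say, position 11 if that is where the series found in the preceding period ended.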

The example illustrated in Figure 3B describes a preferred embodiment. Again, the second period 322 is searched for six extreme signal values 311, 312, 313, 314, 315, 316. The temporal distances between these extreme signal values 311, 312, 313, 314, 315, 316 are denoted by 340, 342, 344, 346, 348. It can also be seen that Figure 3B includes a rhythmic series which continues beyond two periods 320, 322 because the temporal distance 350 between the last extreme value sample 306 of the last rhythmic series found in the preceding period 320 and the first extreme value sample 311 of the new period 322 differs at most by a preset limit from the distances 334, 336, 338, 340 between the other extreme value samples 303, 304, 305, 306, 311, 312 belonging to the series found. Thus the rhythmic series found consists of six extreme value samples 303, 304, 305, 306, 311, 312.
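Continuing a series across a period boundary can be sketched as follows. The function name `extend_series` and the positions are hypothetical, and the sketch makes a simplifying assumption: each candidate distance is compared only with the immediately preceding distance of the series rather than with all earlier distances.

```python
def extend_series(series, next_extremes, limit):
    """Extend a rhythmic series found in the preceding period with extreme
    value samples of the new period for as long as each new temporal
    distance differs from the previous one by at most `limit`.

    `series` must hold at least two positions; `next_extremes` holds the
    positions of the new period's extreme value samples in time order.
    """
    extended = list(series)
    for position in next_extremes:
        previous_distance = extended[-1] - extended[-2]
        candidate_distance = position - extended[-1]
        if abs(candidate_distance - previous_distance) > limit:
            break  # the series ends at the last accepted sample
        extended.append(position)
    return extended


# Hypothetical positions: a series of four samples in the first period and
# six extreme value samples in the second period.
first_series = [5, 7, 9, 11]
second_period_extremes = [13, 15, 18, 19, 21, 23]
combined = extend_series(first_series, second_period_extremes, limit=0)
```

With these positions the first two extreme value samples of the second period continue the series, giving six samples in total, and the series ends where the distance grows to three sampling intervals.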

A second rhythmic series is also found in the second period 322, the series consisting of extreme value samples 314, 315, 316 between which the temporal distances 346 and 348 are equal. It is also interesting to note that a rhythmic series may also consist of both highest values 314, 316 and lowest values 315.

It is also possible that one finds a series consisting of positive peaks, and a series consisting of negative peaks. If these series overlap temporally, this strongly indicates that a series has been found.

The series found can be processed further according to the prior art to find out whether it constitutes one or more longer basic series. In the further processing it can also be analysed whether a real series was found, because a series may be e.g. too short considering the character of the signal to be examined.

Figure 5 illustrates a second preferred principal embodiment of the invention. The steps of this principal embodiment are shown in Figure 1C in which the function of block 104 is performed in blocks 140, 142 and 144. Figure 5 corresponds to Figure 2A, i.e. the signal to be examined has been placed in the system of coordinates, in which case the x-axis represents time T and the y-axis the signal value V at any given moment.

The method of Figure 1C is similar to the method shown in Figure 1A, except for the following refinements and changes. In block 140 the signal is searched for all local extreme value samples, i.e. local maximums (or minimums), in other words all the points at which the direction of the signal values changes. In Figure 5 these maximums are marked with a black dot 520.
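Block 140 can be sketched as a search for the points at which the signal stops rising and starts falling. The function name `local_maxima` is hypothetical, and the sketch assumes strict comparisons, so flat plateaus and the first and last samples are not reported as maximums.

```python
def local_maxima(samples):
    """Return the positions of all samples that are strictly greater than
    both of their neighbours, i.e. where the direction of the signal values
    changes from ascending to descending."""
    return [
        i
        for i in range(1, len(samples) - 1)
        if samples[i - 1] < samples[i] > samples[i + 1]
    ]


# Illustrative signal values, not taken from Figure 5.
signal = [0, 2, 1, 3, 0, 1, 5, 2]
maxima_positions = local_maxima(signal)
```

Local minimums can be found in the same way by reversing both comparisons.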

Thereafter block 146 is used for repeating the steps of blocks 142, 144, 106 and 108 until a rhythmic series is found or until it is concluded that a rhythmic series can no longer be found. In block 142 each remaining extreme value sample is evaluated so that the extreme value sample having an ascending relation to the preceding extreme value sample receives a symbol indicating ascent, and the extreme value sample having a descending relation to the preceding extreme value sample receives a symbol indicating descent. Thus in block 142 the extreme value samples found are examined to find out what their relation to the preceding extreme value sample is. If the value of the maximum exceeds the value of the preceding maximum, the maximum is marked with a plus sign. Correspondingly, if the value is lower than that of the preceding maximum, the maximum is marked with a minus sign. The first maximum to be examined is given a plus sign. All the maximums of the period to be examined are processed as described above. In the first round the maximums thus receive the following signs: +-+-

Then in block 144 the extreme value samples that were given one of the symbols are removed from processing, i.e. in our example those maximums that received a plus sign are selected for examination for the second round 508 and the extreme value samples that received a minus sign are removed from processing. In Figure 5 this is illustrated by circles around the plus signs. Subsequently, the functions of blocks 106, 108 described earlier are performed, i.e. it is checked whether the extreme value samples marked with plus signs form a rhythmic series. In block 146 it is checked whether the search for a rhythmic series should be continued. If a rhythmic series with a sufficiently good quality has been found, the search is not necessarily continued. On the other hand, if it is noticed that a rhythmic series can no longer be found, we proceed to block 112. The quality of a rhythmic series can be measured e.g. with the average magnitude of the differences between the temporal distances.
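One round of blocks 142 and 144, together with the quality measure mentioned above, can be sketched as follows. The function names are hypothetical, each maximum is represented as a `(position, value)` pair, and the treatment of equal values (marked with a minus sign) is an assumption, since the text only covers higher and lower values.

```python
def label_signs(values):
    """Block 142: the first maximum receives '+'; every other maximum
    receives '+' if it exceeds the preceding maximum, otherwise '-'."""
    signs = ["+"]
    for previous, current in zip(values, values[1:]):
        signs.append("+" if current > previous else "-")
    return signs


def keep_ascending(maxima):
    """Block 144: keep only the maximums that received a plus sign.
    `maxima` is a list of (position, value) pairs in time order."""
    signs = label_signs([value for _, value in maxima])
    return [m for m, sign in zip(maxima, signs) if sign == "+"]


def series_quality(positions):
    """Average magnitude of the differences between adjoining temporal
    distances; lower is better. Returns None for fewer than three samples."""
    distances = [b - a for a, b in zip(positions, positions[1:])]
    differences = [abs(d2 - d1) for d1, d2 in zip(distances, distances[1:])]
    if not differences:
        return None
    return sum(differences) / len(differences)


# Illustrative maximums, not taken from Figure 5.
maxima = [(0, 3), (2, 1), (5, 4), (7, 2), (9, 5), (12, 6), (14, 1)]
second_round = keep_ascending(maxima)
```

A perfectly regular series such as positions 0, 10, 20, 30 yields a quality value of zero under this measure.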

In our example a decision to continue was made in block 146 because a rhythmic series had not been found yet. During the new round 508 the maximums are processed again according to the rules described above. Thus the first maximum, for example, is again given a plus sign. The originally third maximum accepted for the second round receives a minus sign because its value is lower than that of the first maximum. Then the originally fifth maximum receives a minus sign because its value is lower than that of the originally third maximum. The originally sixth maximum receives a plus sign because its value is higher than that of the originally fifth maximum. All thirteen maximums of the first round provided with a plus sign and selected for the second round are processed this way. The second round yields the following signs: +--++-+++-++-. Since no rhythmic series was found in the second round, the search is continued in a third round.

The maximums that were given a plus sign are selected for the third round 510 from the second round. The maximums selected from among the maximums of the second round 508 have been circled. The selected eight maximums are again processed according to the rules described above, which yields the following signs: +-+-++-+. This means that a rhythmic series was not found in the third round, either.

Five circled maximums provided with plus signs are selected for the fourth round 512 from the third round 510. Processing of the fourth round yields the following signs: ++-++. The four maximums provided with plus signs from the fourth round 512 are circled in Figure 5.

After this, the method of the invention continues normally from block 106. In other words, the distance 502 between the originally first maximum sample 500A and the eighth maximum sample 500B, the distance 504 between the eighth 500B and the sixteenth maximum sample 500C, and the distance 506 between the sixteenth 500C and the twenty-second maximum sample 500D are compared with one another. Since these temporal distances 502, 504, 506 differ from one another at most by a preset limit, the extreme value samples 500A, 500B, 500C and 500D in question constitute a rhythmic series according to block 108.
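The repeated rounds of Figure 1C can be sketched as a loop. This is a simplified illustration under stated assumptions: the function names are hypothetical, each maximum is a `(position, value)` pair, a round counts as successful only when all remaining maximums together form a rhythmic series, and the search is abandoned when a round removes nothing or fewer than three maximums remain.

```python
def is_rhythmic(positions, limit):
    """Blocks 106 and 108 for a whole list of positions: True if there are
    at least two adjoining distances and all of them differ from their
    neighbour by at most `limit`."""
    distances = [b - a for a, b in zip(positions, positions[1:])]
    return len(distances) >= 2 and all(
        abs(d2 - d1) <= limit for d1, d2 in zip(distances, distances[1:])
    )


def thin_until_rhythmic(maxima, limit):
    """Repeat blocks 142, 144, 106 and 108 until the remaining maximums form
    a rhythmic series. Returns their positions, or None if none is found."""
    current = list(maxima)
    while len(current) >= 3:
        # Blocks 142 and 144: keep the first maximum and every maximum that
        # exceeds the one preceding it (the maximums given a plus sign).
        kept = [current[0]] + [
            cur for prev, cur in zip(current, current[1:]) if cur[1] > prev[1]
        ]
        positions = [position for position, _ in kept]
        if is_rhythmic(positions, limit):
            return positions
        if len(kept) == len(current):
            return None  # nothing was removed, so further rounds change nothing
        current = kept
    return None


# Illustrative maximums: strong peaks every 10 samples with weaker peaks between.
maxima = [(0, 5), (2, 1), (4, 3), (6, 2), (10, 6), (12, 1),
          (15, 4), (20, 7), (23, 2), (26, 3), (30, 8)]
series = thin_until_rhythmic(maxima, limit=1)
```

In this illustrative data the first round leaves seven maximums that are not yet regular, and the second round leaves the four strong peaks at positions 0, 10, 20 and 30.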

The second principal embodiment described above can also be implemented in a memory-saving manner by originally inserting the samples into a table where each table index indicates the temporal location of the sample. For example, a sample with a length of 100 milliseconds provides 8000 points to be examined, each of the points requiring 16 bits, i.e. 128 kilobits of storage capacity. From this point onwards the analysis described requires a two-dimensional table where one bit indicates the sign of the maximum, e.g. zero indicates the minus sign and one the plus sign. The second column of the table contains a pointer to the index of the corresponding maximum in the original table with the length of 128 kilobits.
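The table layout described above can be sketched as follows. The names `storage_bits` and `build_maxima_table` are hypothetical, and the sketch assumes that the table index equals the sample position, that maximums are detected with strict comparisons, and that equal values receive the minus sign.

```python
BITS_PER_SAMPLE = 16  # sample width used in the example above


def storage_bits(sample_count, bits_per_sample=BITS_PER_SAMPLE):
    """Storage needed for the original sample table, in bits."""
    return sample_count * bits_per_sample


def build_maxima_table(samples):
    """Build the two-column table: each row is (sign_bit, index), where
    sign_bit is 1 for a plus sign and 0 for a minus sign, and index points
    to the position of the maximum in the original sample table."""
    rows = []
    previous_value = None
    for index in range(1, len(samples) - 1):
        if samples[index - 1] < samples[index] > samples[index + 1]:
            value = samples[index]
            # The first maximum is given a plus sign; the others are compared
            # with the preceding maximum.
            ascending = previous_value is None or value > previous_value
            rows.append((1 if ascending else 0, index))
            previous_value = value
    return rows


# Illustrative sample table; the index is the temporal location of the sample.
samples = [0, 2, 1, 3, 0, 1, 5, 2, 4, 1]
table = build_maxima_table(samples)
```

Only the sign bit and the index are stored per maximum; the sample values themselves stay in the original table and are looked up through the index when needed.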

Figure 4 illustrates an apparatus applying the method according to the invention. Only the relevant parts of the apparatus are described here, which are the input 400 of the signal to be examined, sampling means 402 for taking samples from the signal at a certain frequency 202, and microprocessor 404 for processing the samples employing the method described above. The invention is preferably implemented by means of software, in which case the invention requires functions in the microprocessor, the microprocessor 404 being consequently arranged to: divide the signal into at least one period, find local extreme values of the samples, compute the temporal distance between each two adjoining extreme value samples in turn, conclude: if at least two adjoining temporal distances differ from each other at most by a preset limit, the extreme value samples used for computing these distances constitute a rhythmic series.

Naturally, the invention can be implemented not only by means of software but also by means of an ASIC (application-specific integrated circuit) or separate logic consisting of hardware parts.

By means of software the invention can also be implemented by storing the program in a memory means that can be read by a general-purpose computer. In that case the computer program stored in the memory means performs the method of evaluating the rhythmicity of a digital signal composed of samples. The memory means may be e.g. a computer hard disk, disk, CD-ROM, or any other memory means in which a computer program can be stored and from which a computer can read the program and perform it.

Even though the invention has been described with reference to an example according to the accompanying drawings, it is clear that the invention is not limited thereto, but it can be modified in various ways within the scope of the inventive concept disclosed in the appended claims.

Claims

1. A method of evaluating the rhythmicity of a digital signal composed of samples, characterized in that the method comprises the steps of:

(102) setting the next period of the signal under examination;

(104) finding local extreme values of the samples;

(106) computing the temporal distance between each two adjoining extreme value samples in turn;

(108) if at least two adjoining temporal distances differ from each other at most by a preset limit, the extreme value samples used for computing said temporal distances constitute a rhythmic series;

(110) repeating the steps of the method until the whole signal has been examined.

2. A method according to claim 1, characterized in that finding (104) of extreme value samples comprises searching for extreme value samples (301, 302, 303, 304, 305, 306) that differ most from the mean value (206) of the samples (204).

3. A method according to claim 2, characterized in that a predetermined number of samples are selected as extreme value samples (301, 302, 303, 304, 305, 306).

4. A method according to claim 3, characterized in that the predetermined number of extreme value samples (301, 302, 303, 304, 305, 306) depends on the type of rhythmicity.

5. A method according to claim 1, characterized in that the limit depends on the type of rhythmicity.

6. A method according to claim 1, characterized in that the starting point of the next period (322) is the sample (360) following the last sample (306) of the preceding period (320).

7. A method according to claim 1, characterized in that the starting point of the next period (322) is the last extreme value sample (306) of the rhythmic series found in the preceding period (320).

8. A method according to claim 1, characterized in that the starting point of the next period (322) is after the first sample (362) of the preceding period (320).

9. A method according to claim 1, characterized in that the rhythmic series is the same series that continues at least beyond two periods (320, 322) if the temporal distance (350) between the last extreme value sample (306) of the last rhythmic series found in the preceding period (320) and an extreme value sample (311) in the new period (322) differs at most by a preset limit from the temporal distances (334, 336, 338, 340) of the other extreme value samples (303, 304, 305, 306, 311, 312) belonging to the series found.

10. A method according to claim 1, characterized in that finding (104) of extreme value samples comprises

(140) finding all local extreme value samples, and

(146) repeating the following steps until a rhythmic series is found or until it is concluded that a rhythmic series can no longer be found:

(142) evaluating each remaining extreme value sample in turn so that an extreme value sample having an ascending relation to the preceding extreme value sample receives a symbol indicating ascent, and an extreme value sample having a descending relation to the preceding extreme value sample receives a symbol indicating descent;

(144) removing the extreme value samples that received one of the symbols from the processing; and

performing the two steps (106, 108) that follow finding (104) of the extreme value samples.

11. An apparatus for evaluating the rhythmicity of a digital signal composed of samples, comprising: a signal input (400); sampling means (402) for taking samples from the signal at a certain frequency; a microprocessor (404) for processing the samples; characterized in that the microprocessor (404) is arranged to: divide the signal into at least one period; find local extreme values of the samples; compute the temporal distance between each two adjoining extreme value samples in turn; conclude: if at least two adjoining temporal distances differ from each other at most by a preset limit, the extreme value samples used for computing said distances constitute a rhythmic series.

12. An apparatus according to claim 11, characterized in that to find the local extreme values of the samples the microprocessor (404) is arranged to find the extreme value samples (301, 302, 303, 304, 305, 306) that differ most from the mean value (206) of the samples (204).

13. An apparatus according to claim 12, characterized in that the microprocessor (404) is arranged to select a predetermined number of samples as the extreme value samples (301, 302, 303, 304, 305, 306).

14. An apparatus according to claim 13, characterized in that the microprocessor (404) is arranged to function so that the predetermined number of extreme value samples (301, 302, 303, 304, 305, 306) depends on the type of rhythmicity.

15. An apparatus according to claim 11, characterized in that the microprocessor (404) is arranged to function so that the limit depends on the type of rhythmicity.

16. An apparatus according to claim 11, characterized in that the microprocessor (404) is arranged to function so that the starting point of the next period (322) is the sample (360) following the last sample (306) of the preceding period (320).

17. An apparatus according to claim 11, characterized in that the microprocessor (404) is arranged to function so that the starting point of the next period (322) is the last extreme value sample (306) of the rhythmic series found in the preceding period (320).

18. An apparatus according to claim 11, characterized in that the microprocessor (404) is arranged to function so that the starting point of the next period (322) is after the first sample (362) of the preceding period (320).

19. An apparatus according to claim 11, characterized in that the microprocessor (404) is arranged to conclude: the rhythmic series is the same series that continues at least beyond two periods (320, 322) if the temporal distance (350) between the last extreme value sample (306) of the last rhythmic series found in the preceding period (320) and an extreme value sample (311) in the new period (322) differs at most by a preset limit from the temporal distances (334, 336, 338, 340) of the other extreme value samples (303, 304, 305, 306, 311, 312) belonging to the series found.

20. An apparatus according to claim 11, characterized in that to find the local extreme values of the samples the microprocessor (404) is arranged to: find all local extreme sample values, and repeat the following steps until a rhythmic series is found or until it is concluded that a rhythmic series can no longer be found: evaluate each remaining extreme value sample in turn so that an extreme value sample having an ascending relation to the preceding extreme value sample receives a symbol indicating ascent, and an extreme value sample having a descending relation to the preceding extreme value sample receives a symbol indicating descent; remove the extreme value samples that received one of the symbols from the processing; and compute the temporal distance between each two adjoining extreme value samples in turn; and conclude: if at least two adjoining temporal distances differ from each other at most by a preset limit, the extreme value samples used for computing said distances constitute a rhythmic series.

21. A memory means which can be read by a computer, comprising a computer program to be performed by a computer, the program performing the method of evaluating the rhythmicity of a digital signal (200) composed of samples (204), characterized in that the method comprises the steps of:

(102) setting the next period of the signal under examination;

(104) finding local extreme values of the samples;

(106) computing the temporal distance between each two adjoining extreme value samples in turn;

22. A memory means according to claim 21, characterized in that finding (104) of extreme value samples comprises searching for extreme value samples (301, 302, 303, 304, 305, 306) that differ most from the mean value (206) of the samples (204).

23. A memory means according to claim 21, characterized in that finding (104) of extreme value samples comprises

(140) finding all local extreme value samples, and

(146) repeating the following steps until a rhythmic series is found or until it is concluded that a rhythmic series can no longer be found:

(142) evaluating each remaining extreme value sample in turn so that an extreme value sample having an ascending relation to the preceding extreme value sample receives a symbol indicating ascent, and an extreme value sample having a descending relation to the preceding extreme value sample receives a symbol indicating descent;

(144) removing the extreme value samples that received one of the symbols from the processing; and

performing the two steps (106, 108) that follow finding (104) of the extreme value samples.