Publication number | WO1995034135 A1 |

Publication type | Application |

Application number | PCT/AU1995/000330 |

Publication date | Dec 14, 1995 |

Filing date | Jun 2, 1995 |

Priority date | Jun 3, 1994 |

Also published as | US5812072 |

Publication number | PCT/1995/330, PCT/AU/1995/000330, PCT/AU/1995/00330, PCT/AU/95/000330, PCT/AU/95/00330, PCT/AU1995/000330, PCT/AU1995/00330, PCT/AU1995000330, PCT/AU199500330, PCT/AU95/000330, PCT/AU95/00330, PCT/AU95000330, PCT/AU9500330, WO 1995/034135 A1, WO 1995034135 A1, WO 1995034135A1, WO 9534135 A1, WO 9534135A1, WO-A1-1995034135, WO-A1-9534135, WO1995/034135A1, WO1995034135 A1, WO1995034135A1, WO9534135 A1, WO9534135A1 |

Inventors | John Masters |

Applicant | John Masters |


WO 1995034135 A1

Abstract

The present invention discloses a mathematical technique which finds substantial utility in electrical engineering - particularly in the general field of data transformation. Such data transformation occurs in electrical engineering applications such as data encryption and data compression. The technique uses a commencement matrix to transform a first number of a sequence of numbers. This initial transformation forms a forward resultant matrix. Successive transformation of succeeding numbers in the sequence produces for each number in the sequence an augmented forward resultant matrix. Each augmented forward resultant matrix is used to transform the next succeeding number of the sequence, and so on. A final transformed sequence is generated from the final form of the augmented forward resultant matrix.

Claims (OCR text may contain errors)

1. A method of generating a digitally encoded electric output signal representing a transformation of a digitally encoded electric input signal representing a sequence of numbers, said method comprising the steps of:

(i) identifying each said number of said sequence represented in said input signal;

(ii) coupling a first said number of said sequence with a predetermined commencement matrix, said coupling comprising the forward application of a reversible mathematical procedure, to form a forward resultant matrix;

(iii) coupling each succeeding number of said sequence with the forward resultant matrix of the preceding step to form an augmented forward resultant matrix;

(iv) carrying out step (iii) for each remaining number of said sequence in turn to form a final augmented forward resultant matrix; and

(v) forming said output signal from the contents of said final augmented forward resultant matrix.

2. A method of generating a digitally encoded output signal representing a reverse transformation to that described above using as an input the output of the above described method in order to generate an output electric signal which is digitally encoded to represent said sequence of numbers, said method comprising the steps of:

(i) identifying from the output signal of the above described method the contents of said final augmented forward resultant matrix;

(ii) applying in the reverse direction said reversible mathematical procedure to the contents of said final augmented forward resultant matrix to emit a number and form a reverse resultant matrix;

(iii) applying in the reverse direction said reversible mathematical procedure to the contents of said reverse resultant matrix to emit a number and form an augmented reverse resultant matrix;

(iv) carrying out step (iii) for each remaining augmented reverse resultant matrix until said predetermined commencement matrix is generated; and

(v) forming said output electric signal by representing said emitted numbers in the sequence of their emission.

3. The method of claim 1 or 2, wherein the reversible mathematical procedures comprise linear operators.

4. The method of claim 1 or 2, wherein the reversible mathematical forward procedure is selected from the set consisting of:

(i) addition (subtraction)

(ii) multiplication (division), and

(iii) exponentiation (taking the logarithm),

and the corresponding reverse procedure is set out in parentheses.

5. The method of any one of claims 1 to 3, wherein the predetermined commencement matrix is either the unity matrix or the product of the unity matrix and two.

6. The method of any one of claims 1 to 4, wherein the final augmented forward matrix is transposed to emit the sequence in the reverse transformation in the reverse order.

7. The method of any one of claims 1 to 5, wherein the final augmented matrix is further transformed by means of a Z transformation and if necessary repeatedly Z transformed, to arrive at a single number which represents the original sequence of numbers.

8. The method of claim 7, wherein the sequence is recovered from the single number.

9. A method of generating a digitally encoded electric signal, said method being substantially as described with reference to the drawings.

10. A data encryption method comprising the step of carrying out the first method as set forth in any one of claims 1 and 3-8 on data comprising said sequence of numbers.

11. A data decryption method comprising the step of carrying out said second method as set forth in any one of claims 2-8 on said encrypted data.

12. A method of data compression comprising the step of carrying out the first method as set forth in any one of claims 1 and 3-8.

13. A method of data expansion comprising the step of carrying out the second method as set forth in paragraphs 2-8.

Description (OCR text may contain errors)

A DATA CONVERSION TECHNIQUE

The present invention relates to the physical applications to which a new mathematical algorithm can be applied. Such applications include data encryption and decryption, data compression, error correction and like applications which involve a transformation of data and the generation of electric signals.

Background Art

Electronic data processing and transmission equipment represents the data to be processed and/or transmitted as a sequence of binary digits in which both the magnitude of the digit (1 or 0) and its position in the sequence are required in order to preserve the necessary information.

Object Of The Invention

The present invention allows that binary data to be transformed whilst still maintaining the information inherent in the binary data. The inverse transformation so as to recover the binary data is also able to be achieved.

Summary Of The Invention

In accordance with the first aspect of the present invention there is disclosed a method of generating a digitally encoded electric output signal representing a transformation of a digitally encoded electric input signal representing a sequence of numbers, said method comprising the steps of:

(i) identifying each said number of said sequence represented in said input signal;

(ii) coupling a first said number of said sequence with a predetermined commencement matrix, said coupling comprising the forward application of a reversible mathematical procedure, to form a forward resultant matrix;

(iii) coupling each succeeding number of said sequence with the forward resultant matrix of the preceding step to form an augmented forward resultant matrix;

(iv) carrying out step (iii) for each remaining number of said sequence in turn to form a final augmented forward resultant matrix; and

(v) forming said output signal from the contents of said final augmented forward resultant matrix.

In accordance with a second aspect of the present invention there is disclosed a method of generating a digitally encoded output signal representing a reverse transformation to that described above using as an input the output of the above described method in order to generate an output electric signal which is digitally encoded to represent said sequence of numbers, said method comprising the steps of:

(i) identifying from the output signal of the above described method the contents of said final augmented forward resultant matrix;

(ii) applying in the reverse direction said reversible mathematical procedure to the contents of said final augmented forward resultant matrix to emit a number and form a reverse resultant matrix;

(iii) applying in the reverse direction said reversible mathematical procedure to the contents of said reverse resultant matrix to emit a number and form an augmented reverse resultant matrix;

(iv) carrying out step (iii) for each remaining augmented reverse resultant matrix until said predetermined commencement matrix is generated; and

(v) forming said output electric signal by representing said emitted numbers in the sequence of their emission.

In accordance with a third aspect of the present invention there is disclosed a data encryption method comprising the step of carrying out said first method on data comprising said sequence of numbers.

In accordance with a fourth aspect of the present invention there is disclosed a data decryption method comprising the step of carrying out said second method on said encrypted data.

In accordance with a fifth aspect of the present invention there are disclosed methods of data compression and expansion comprising the steps of carrying out said first and second methods respectively.

The above described reversible mathematical forward procedure, and its corresponding reverse procedure set out in parentheses, includes the following:

(i) addition (subtraction)

(ii) multiplication (division), and

(iii) exponentiation (taking the logarithm).

The predetermined commencement matrix can include the unity matrix and the product of the unity matrix and two.

The final augmented forward matrix can be transposed to emit the sequence in the reverse transformation in the reverse order.

The final augmented matrix can be further transformed by means of a Z transformation, and if necessary repeatedly Z transformed, to arrive at a single number which represents the original sequence of numbers. The sequence can be recovered from this single number.

Brief Description Of The Drawings

Embodiments of the present invention will now be described with reference to the drawings in which:

Fig. 1 is a schematic block diagram of a communication system incorporating encryption (encoding) and decryption (decoding) in accordance with one embodiment of the present invention;

Figs. 2-4 are schematic block diagrams of encoding and decoding arrangements for generating output electric signals from input electric signals;

Fig. 5 is a Table illustrating a still further data compressing transformation; and

Fig. 6 is a bar code representing an encoded sequence which in turn represents the letters DO NOT INCINERATE.

An Appendix to this specification contains a mathematical paper to be published by the inventor which details the mathematical basis underlying the present invention.

Detailed Description

First Algorithm

In order to provide an initial insight into the algorithm lying behind the present invention, consider the binary sequence 01101 which uses 5 bits to represent the decimal number 13. Each bit of this sequence can now be "coupled" in turn with a matrix. The initial or commencement matrix is the matrix which represents one, unity or the identity matrix, viz: $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$

After the initial bit has been "coupled" with the matrix the result is an amended or augmented matrix. Each subsequent bit is then coupled with the augmented matrix resulting from all the previous "couplings".

The "rule" for the coupling is as follows:-

(A) If the bit to be "coupled" is a zero, then add the top row of the matrix to the bottom row, placing the result in the bottom row, and leaving the top row unchanged; and

(B) If the bit to be "coupled" is a one, then add the bottom row of the matrix to the top row, placing the result in the top row, and leaving the bottom row unchanged.

Thus, for our example of 01101 the initial bit to be "coupled" is the least significant bit which is 1.

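The starting matrix is $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and we add the bottom row to the top as in (B) above, giving an augmented forward resultant matrix $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$.

The next bit to be coupled is a zero, so we now add the top row to the bottom in accordance with (A) above, giving $\begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}$.

The next bit is a 1 and so the result is $\begin{pmatrix} 2 & 3 \\ 1 & 2 \end{pmatrix}$.

The next bit to be coupled is also a 1, giving $\begin{pmatrix} 3 & 5 \\ 1 & 2 \end{pmatrix}$.

The last bit to be coupled is a zero, giving $\begin{pmatrix} 3 & 5 \\ 4 & 7 \end{pmatrix}$.

So in accordance with the transformation, the binary sequence 01101 can be represented by the final forward augmented matrix $\begin{pmatrix} 3 & 5 \\ 4 & 7 \end{pmatrix}$.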
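The forward coupling can be sketched in a few lines of Python. This is only an illustrative sketch of rules (A) and (B) as stated above; the function names `couple` and `encode` are hypothetical and do not come from the specification.

```python
def couple(matrix, bit):
    """Couple one bit with a 2x2 matrix using rule (A) or rule (B)."""
    (a, b), (c, d) = matrix
    if bit == 0:
        # Rule (A): add the top row to the bottom row, top row unchanged.
        return ((a, b), (a + c, b + d))
    # Rule (B): add the bottom row to the top row, bottom row unchanged.
    return ((a + c, b + d), (c, d))


def encode(bits):
    """Couple each bit in turn, starting from the identity matrix."""
    matrix = ((1, 0), (0, 1))
    for bit in bits:
        matrix = couple(matrix, bit)
    return matrix


# 01101 taken from the least significant bit upwards: 1, 0, 1, 1, 0
print(encode([1, 0, 1, 1, 0]))  # ((3, 5), (4, 7))
```

The intermediate matrices returned by `couple` are the augmented forward resultant matrices of the worked example.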

In order to reverse the transformation the "rule" to be followed is:

(C) If the sum of the numbers in the bottom row of the matrix is less than the sum of the numbers in the top row of the matrix, "emit" a one, and subtract the bottom row of the matrix from the top row, placing the result in the top row, and leaving the bottom row unchanged; and

(D) If the sum of the numbers in the top row of the matrix is less than the sum of the numbers in the bottom row of the matrix, "emit" a zero, and subtract the top row of the matrix from the bottom row of the matrix, placing the result in the bottom row, and leaving the top row unchanged.

The reverse transformation is completed when the commencement matrix in the form of identity matrix is reached.

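Thus taking our previous final augmented forward matrix of $\begin{pmatrix} 3 & 5 \\ 4 & 7 \end{pmatrix}$ the reverse transformation is as follows:-

$\begin{pmatrix} 3 & 5 \\ 4 & 7 \end{pmatrix} \to \begin{pmatrix} 3 & 5 \\ 1 & 2 \end{pmatrix} \to \begin{pmatrix} 2 & 3 \\ 1 & 2 \end{pmatrix} \to \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \to \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$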

It will be seen that the emitted sequence is 10110 which is actually the reverse of the initial sequence. Emitting the "most significant bit" of the sequence first can have significant engineering advantages in storing digits in shift registers, however, this "reverse sequence" need not be obtained.
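A minimal Python sketch of rules (C) and (D) follows. The function name `decode` and the validity checks are assumptions added for illustration. The returned list records the bits in their order of emission, so the first element is the bit that was coupled last (the most significant bit of the worked example), and reversing the list gives back the order in which the bits were coupled.

```python
IDENTITY = ((1, 0), (0, 1))


def decode(matrix):
    """Apply rules (C) and (D) until the identity matrix is reached."""
    bits = []
    while matrix != IDENTITY:
        (a, b), (c, d) = matrix
        top, bottom = a + b, c + d
        # Guard (an assumption, not in the text): reject inputs that could
        # never have been produced by rules (A) and (B).
        if min(a, b, c, d) < 0 or top == bottom or min(top, bottom) == 0:
            raise ValueError("not a matrix produced by rules (A) and (B)")
        if bottom < top:
            # Rule (C): emit a one, subtract the bottom row from the top row.
            bits.append(1)
            matrix = ((a - c, b - d), (c, d))
        else:
            # Rule (D): emit a zero, subtract the top row from the bottom row.
            bits.append(0)
            matrix = ((a, b), (c - a, d - b))
    return bits


emitted = decode(((3, 5), (4, 7)))
print(emitted)                   # bits in order of emission
print(list(reversed(emitted)))   # bits in the order they were coupled
```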

The sequence can be obtained in the correct order by the simple expedient of transposing the initial starting matrix (ie. the result of the original transformation).

Thus the matrix $\begin{pmatrix} 3 & 5 \\ 4 & 7 \end{pmatrix}$ transposes to $\begin{pmatrix} 3 & 4 \\ 5 & 7 \end{pmatrix}$.

If the latter matrix is used as the starting point in the reverse transformation, the result is as follows:-

It will be seen that the correct sequence is emitted 01101 with the "least significant bit" emitted first.

Manipulation of even lengthy sequences of binary data will lead to the observation that the only numbers generated within the matrix are relative prime numbers (ie. they have no common factors other than 1).

Having appreciated the transformation of the binary sequence 01101 into the matrix $\begin{pmatrix} 3 & 5 \\ 4 & 7 \end{pmatrix}$ it will be further appreciated that the transformation need not stop there.

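In particular, for matrices the product of the determinants equals the determinant of the product. The commencement matrix and each coupling step have a determinant of 1, so every resultant matrix also has a determinant of 1. Thus, for the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the following holds:

(ad - bc) = 1

that is a = (bc + 1)/d. (RULE E)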

Similar expressions can be generated for each of b, c and d in terms of the other three numbers.

The first consequence of this result is that in order to store the resultant matrix it is necessary to store only 3 numbers, not 4. That is, the binary sequence 01101 which represents 13 in decimal can also be represented by the numbers 3, 5, 4 and 7 (order being important) or any three of those numbers (again order being important in the representation).
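Rule E can be checked numerically with a short Python sketch. The helper name `recover_a` is hypothetical; the sketch simply evaluates a = (bc + 1)/d for the worked example and confirms the unit determinant and the relative primality of the entries in each row and column.

```python
from math import gcd


def recover_a(b, c, d):
    """Recover the top-left entry from the other three using a = (bc + 1)/d."""
    numerator = b * c + 1
    if d == 0 or numerator % d != 0:
        raise ValueError("b, c and d cannot belong to a valid matrix")
    return numerator // d


b, c, d = 5, 4, 7
a = recover_a(b, c, d)
print(a)                                            # 3
print(a * d - b * c)                                # 1
print(gcd(a, c), gcd(b, d), gcd(a, b), gcd(c, d))   # 1 1 1 1
```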

It will be apparent to those skilled in the art that in binary form a large number of variables (bits) having a limited range of values (0 or 1) can be used to represent information. However, in accordance with the present invention this information can be represented by a very small number of variables (eg. 4, 3 or even less as will be explained hereafter) each of which can have a value with a very large range of values. That is, the 5 bits of binary data can be stored as 3 or 4 numbers, however, as the number of bits increases the number of numbers used to represent the binary data does not increase. Instead, only the magnitude of these representative numbers increases.

With the above described embodiment of the algorithm of the present invention, a number of practical applications will now be described.

The first such practical application is that of data encryption and decryption. As schematically indicated in Fig. 1, a data transmission system at each of the "ends" 1 and 2 of the system takes the form of a keyboard 3 connected to an encoder 4 and a printer 5 connected to a decoder 6. Both the encoder 4 and decoder 6 are connected to a transceiver 7 which can transmit to, and receive from, a like transceiver 7. Although the transmission is schematically indicated as being a radio broadcast, transmission by wire is also possible. The encoders 4 and decoders 6 operate in accordance with the rules A, B, C and D as described above. The output of the keyboard 3, which is in binary ASCII code, is converted into 4 decimal numbers (or 3 if the above described storage reduction is utilized), which can then be converted into binary form for transmission. The process is reversed following reception, which allows the data originally input into the keyboard 3 of, say, station 1, to be output by the printer 5 of station 2.

It will be apparent to those skilled in the art that the logic circuits necessary to create the encoders 4 and decoders 6 are quite straightforward and need not be described in any detail.

A second application which arises out of the first application is the generation and manipulation of digitally encoded electric waveforms. As seen in Fig. 2, an encoder 14 which operates in accordance with rules A and B given above is able to receive as an input a digitally encoded electric signal which represents the binary number 01101 (13 in decimal) and output four digitally encoded electric signals which respectively represent the numbers 011 (3 in decimal), 101 (5 in decimal), 100 (4 in decimal), and 111 (7 in decimal). These four signals can be output either in parallel form (as schematically indicated in Fig. 2) or in serial form. Fig. 3 illustrates the reverse process. Here a decoder 16 which operates in accordance with rules C and D above receives as input the four signals generated by the encoder 14 of Fig. 2. Again the signals can be input either in parallel or serial form. The output of decoder 16 is the input signal applied to encoder 14 which represents the binary number 01101 (13 in decimal).

Further, as indicated in Fig. 4, by use of a decoder 26 which operates in accordance with rules C, D and E above, then the desired output signal can be generated from only three input signals since the "missing" fourth input can be internally generated in accordance with rule E.

It will be apparent to those skilled in the art that the actual waveforms illustrated in Figs. 2-4 include small pips which are illustrated only for the purpose of making it easier for the reader to distinguish the start and end of each bit of the waveform. Again the internal logic circuitry of the encoder 14 and decoders 16,26 is quite straightforward. In this connection, see any undergraduate electrical engineering text such as:

"Digital Computer Fundamentals", Thomas C. Bartee, Published by McGraw-Hill Book Company;

"Logic Design with Integrated Circuits", William E. Wickes, Published by John Wiley & Sons, Inc.;

"Introduction to Switching Theory and Logical Design", Frederick J. Hill, Ph.D., Gerald R. Peterson, Ph.D., Published by John Wiley & Sons, Inc.;

"Automata and Languages", John M. Howie, 1991, Clarendon Press, Oxford Science Publications.

A third practical application is data compression. In the example indicated above, the digital number 13 represented by 01101 occupies 5 bits and can be represented by the digital numbers 3, 5, 4, 7 which respectively occupy 2, 3, 2 and 3 bits making a total of 10 bits in all. This would be reduced to three numbers 5, 4, 7 utilizing the above described reduced storage which would require 8 bits. The fourth number (3) is given by 3 = (4 x 5 + 1)/7.

However, if storage other than conventional binary data storage is considered, then a large amount of binary data can be stored very simply. For example, consider four rows of numbers each containing only relative prime numbers and each corresponding to one of the four numbers of the matrix. A marker is moved to the location corresponding to value of the corresponding number in the matrix. Thus an arrangement similar to two billiards scorers can store a long binary sequence. Again a further compression is available by storing only 3 of the numbers and calculating the fourth therefrom.

A fourth such practical application is error detection/correction. Firstly, the reverse transformation can be checked for errors by reverse transforming the "received matrix" and reversing the decoded sequence order. The resulting sequence is then compared with the result of transposing the "received matrix" and then reverse transforming. If the comparison does not reveal two identical sequences, then an error in the reverse transformation has been detected. It is then necessary for the calculation to be repeated until the correct result is obtained.

Secondly, and of more importance, is error correction of the transmitted data.

For example, the matrix $\begin{pmatrix} 3 & 5 \\ 4 & 7 \end{pmatrix}$ is reduced to just the three numerals 5, 4 and 7 and transmitted along with the three numerals 5, 4, and 3 derived from the transposed matrix $\begin{pmatrix} 3 & 4 \\ 5 & 7 \end{pmatrix}$. At the receiving end two matrices are generated from the six received numbers.

If the 7 of the first transmitted matrix has been received as a 6, for example, this error can be immediately detected since the received number is not a relative prime. Thus the other received matrix is presumed to be correct.

Similarly, if the 7 of the first transmitted matrix has been received as a 5, for example, then the re-constituted matrices become $\begin{pmatrix} a & 5 \\ 4 & 5 \end{pmatrix}$ and $\begin{pmatrix} 3 & 4 \\ 5 & A \end{pmatrix}$

where a = (5 x 4 + 1)/5 = (20 + 1)/5 which does not give an integer, let alone a relative prime, and must therefore be wrong, whilst

A = (5 x 4 + 1)/3 = (20 + 1)/3 = 21/3 = 7

which is a relative prime and is therefore presumed to be correct.

Still further, if the 7 of the first transmitted matrix had been received as a 1, then the reconstituted first matrix would be $\begin{pmatrix} 21 & 5 \\ 4 & 1 \end{pmatrix}$

where a = (5 x 4 + 1)/1 = 21

However, here clearly the first and second received and re-constituted matrices are not transposes of each other and so there must be an error.

This error can be corrected, and the corrected matrix identified in accordance with the following:

The received matrix can be regarded as the matrix $\begin{pmatrix} x & 5 \\ 4 & y \end{pmatrix}$ where x and y are unknowns. However, each entry in the matrix must be an integer. Furthermore, the cross products of the matrix entries must differ by one. That is 4 x 5 = 20 and so x.y = 20 + 1 = 21. The factors of 21 are 1, 3, 7 and 21 and in general the number of possible factors will be small, being no more than twice the square root of the product. Thus these four integers 1, 3, 7 and 21 are candidates for x and y. Therefore possible matrices are the pair $\begin{pmatrix} 1 & 5 \\ 4 & 21 \end{pmatrix}$, $\begin{pmatrix} 21 & 5 \\ 4 & 1 \end{pmatrix}$ and the pair $\begin{pmatrix} 3 & 5 \\ 4 & 7 \end{pmatrix}$, $\begin{pmatrix} 7 & 5 \\ 4 & 3 \end{pmatrix}$.

Of these four the first pair can be "eliminated" since on being decoded they each yield a sequence of nine binary digits whereas the second pair on being decoded each yield a sequence with the "correct" number of binary digits, namely five. In this connection, the correct number of digits in the sequence is generally known, for example as a result of organisation of transmitted data into "packets" of known length.
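The candidate search described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the candidates are found by simple trial division of bc + 1, and each candidate is decoded with rules (C) and (D) to count the binary digits it yields.

```python
IDENTITY = ((1, 0), (0, 1))


def decoded_length(matrix):
    """Count the bits emitted by rules (C) and (D) before the identity is reached."""
    count = 0
    while matrix != IDENTITY:
        (a, b), (c, d) = matrix
        top, bottom = a + b, c + d
        if min(a, b, c, d) < 0 or top == bottom or min(top, bottom) == 0:
            raise ValueError("not a valid encoding matrix")
        if bottom < top:
            matrix = ((a - c, b - d), (c, d))   # rule (C)
        else:
            matrix = ((a, b), (c - a, d - b))   # rule (D)
        count += 1
    return count


def candidate_matrices(b, c):
    """All integer matrices [[x, b], [c, y]] with x*y = b*c + 1."""
    target = b * c + 1
    return [((x, b), (c, target // x))
            for x in range(1, target + 1) if target % x == 0]


def plausible(b, c, expected_length):
    """Keep only the candidates that decode to the known packet length."""
    return [m for m in candidate_matrices(b, c)
            if decoded_length(m) == expected_length]


for m in candidate_matrices(5, 4):
    print(m, decoded_length(m))
print(plausible(5, 4, 5))
```

For b = 5, c = 4 and a known length of five digits, only the two matrices with entries 3 and 7 on the diagonal survive the length test.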

The error correction properties of the four numbers of encoding matrices are based on the fact that each relative prime pair of numbers contained carries almost as much information as the full matrix. A valid matrix contains the relative prime pairs a,c; b,d; d,c; and b,a.

Each pair can be decoded in such a way that at most only leading or trailing repeated binary digits need to be estimated. The method of decoding these pairs differs only in (a) the rule for terminating the process, and (b) the interpretation to be given when both variables have the value 1.

(a) The termination condition for each pair can be found from the initial settings for encoding in which a=1, b=0, c=0 and d=1. Decoding of the pair a,c for example terminates when a=1 and c=0.

(b) In order to terminate when one variable becomes 0, the decoding process must be such that both variables equal 1 at the step before. The interpretation to be given can be deduced by reference to the encoding process for the pair in question. Initially a=1 and c=0. If a one is encoded first, a is set to a+c. Since c is zero no change occurs to a regardless of the number of leading ones encoded. When the first zero is encoded c is set to a+c=1+0=1 and thus a remains 1. Hence in decoding a,c the point (or position in the sequence) at which a=1 and c=1 is decoded as zero. It is also recognised that this zero may have been preceded by any number of ones including none.

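As previously described the sequence 01101 when encoded in the order least significant bit to most significant bit results in the matrix $\begin{pmatrix} 3 & 5 \\ 4 & 7 \end{pmatrix}$ so that a=3, b=5, c=4 and d=7. For the pair a,c decoding now takes the form:

(a,c) = (3,4): emit 0, giving (3,1); emit 1, giving (2,1); emit 1, giving (1,1); emit 0 for the equality, giving (1,0), at which point decoding terminates.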

On the same principles the pair b,d terminates when b=0 and d=1 and ignores leading zeros; equality is decoded as a one.
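The decoding of the pairs a,c and b,d can be sketched as below. The function names, and the use of repeated subtraction of the smaller number from the larger, are assumptions made for illustration; the termination conditions and the treatment of equality follow (a) and (b) above.

```python
from math import gcd


def decode_pair_ac(a, c):
    """Decode the pair a,c: stop at a=1, c=0; equality is decoded as a zero."""
    if a < 1 or c < 0 or gcd(a, c) != 1:
        raise ValueError("a,c is not a valid relative prime pair")
    bits = []
    while (a, c) != (1, 0):
        if a > c:
            bits.append(1)
            a -= c
        else:               # c > a, or a == c == 1 (decoded as a zero)
            bits.append(0)
            c -= a
    return bits


def decode_pair_bd(b, d):
    """Decode the pair b,d: stop at b=0, d=1; equality is decoded as a one."""
    if b < 0 or d < 1 or gcd(b, d) != 1:
        raise ValueError("b,d is not a valid relative prime pair")
    bits = []
    while (b, d) != (0, 1):
        if b >= d:          # b > d, or b == d == 1 (decoded as a one)
            bits.append(1)
            b -= d
        else:
            bits.append(0)
            d -= b
    return bits


print(decode_pair_ac(3, 4))  # [0, 1, 1, 0]
print(decode_pair_bd(5, 7))  # [0, 1, 1, 0, 1]
```

Any digits beyond those returned cannot be determined from the pair alone: for a,c they would all be ones, and for b,d they would all be zeros.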

The pair d,c decodes in the same order as encoding, that is the least significant bit first. Initially, d=1 and c=0 and equality occurs for the first zero.

The pair b,a also decodes in the same order as encoding. The halting rule is b=0, a=1; equality occurs for the first 1 and leading 0's are ignored.

The following table summarises the decodings from the four pairs for comparison. The final column lists the bits that are not in doubt in each case.

PAIR    DECODED SEQUENCE    BITS NOT IN DOUBT
a,c     01101111......      0110 (leftmost four)
b,d     01101000.....       01101 (leftmost five)
d,c     ....111101101       01101 (rightmost five)
b,a     .....000001101      1101 (rightmost four)

These entries can be interpreted as follows:

For a,c the encoded sequence, read from left to right began 0110. If there were other digits they were all ones.

For b,d the encoded sequence read from left to right began 01101. If there were other digits they were all zeros.

For d,c the encoded sequence read right to left was 10110. If there were other digits they were all ones.

For b,a the encoded sequence read from right to left began 1011. If there were other digits they were all zeros.

It is evident that any two pairs can be used to settle the correct encoding. If the pairs a,c and d,c or the pairs b,d and b,a are chosen then only three of the four numbers available are involved in deciding the correct decoding. Moreover, if as is generally the case, information about the length of the encoded sequence is available to the "receiver", any single pair of numbers provides a complete decoding.

As a consequence of the above, a further method of length dependent encoding (ie. where the sender and receiver both know the length of the sequence being transmitted, say 256 bits, but the receiver does not know the content) is available.

In this method, if both the $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ matrices are each separately used as the start matrix, then leading zeros and ones respectively are ignored. This results in two decoded results which can be matched to correct any error.

The "rule" for this length dependent encoding (with x and y being the working values) is as follows:

Step 1 Set x=0, y=1

Step 2 If the input bit is zero set y=y+x

If the input bit is one set x=y+x

Step 3 If the last input bit has been reached, emit the present values of x and y and then end, otherwise go to step 2 and repeat for the next bit.

Similarly, the "rule" for decoding is as follows:

Step 1 Set i equal to the length of the sequence (eg. i=256)

Step 2 If x ≥ y emit 1 and set x=x-y

If x < y emit 0 and set y=y-x

Step 3 Set i=i-1,

if i is now 0 then halt, otherwise go to step 2.

If the initial setting is x=1 and y=0 then equality is interpreted as "emit 0".
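The length dependent encoding and decoding can be sketched as follows, for the initial setting x=0, y=1. The function names are hypothetical. The sketch assumes that the first comparison in the decoding rule is meant to be x ≥ y, so that the two branches are mutually exclusive and equality is decoded as a one, and that the encoder returns to Step 2 (not Step 1) for each further bit so that x and y are not reset.

```python
def encode_xy(bits):
    """Length dependent encoding starting from x=0, y=1."""
    x, y = 0, 1
    for bit in bits:
        if bit == 0:
            y = y + x
        else:
            x = y + x
    return x, y


def decode_xy(x, y, length):
    """Emit exactly `length` bits; equality (x == y) is decoded as a one."""
    bits = []
    for _ in range(length):
        if x >= y:
            bits.append(1)
            x = x - y
        else:
            bits.append(0)
            y = y - x
    return bits


x, y = encode_xy([1, 0, 1, 1, 0])   # 01101, least significant bit first
print(x, y)                         # 5 7
print(decode_xy(x, y, 5))           # [0, 1, 1, 0, 1]
print(decode_xy(x, y, 8))           # [0, 1, 1, 0, 1, 0, 0, 0]
```

Asking for more digits than were encoded simply pads the result with zeros, which corresponds to the leading zeros that this starting setting ignores.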

Further error correction can be obtained by consideration of the parity of the matrix entries. Random sampling from valid 2 x 2 matrices with three numbers recorded and say the bottom right entry to be reconstructed is from a population in which the other entries have expected values that have odd parity twice as often as they have even parity. This type of probability information enhances error detection and correction and is independent of the length of encoded sequences. There are 16 possible arrangements of odd and even entries for 2x2 matrices; only six of these can occur in a valid matrix. The full parity restrictions are indicated in the following table for valid matrices.

Thus if the equivalent parity of the received matrix is calculated and the parity does not fall within this class, there must be an error.
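The count of six admissible parity arrangements can be reproduced with a short sketch. It relies only on the fact that a valid matrix satisfies ad - bc = 1, so that ad - bc must be odd; the function name and the order in which the patterns are listed are assumptions and need not match the order of the table.

```python
from itertools import product


def valid_parity_patterns():
    """Parity patterns (a, b, c, d) compatible with ad - bc being odd."""
    patterns = []
    for a, b, c, d in product((0, 1), repeat=4):   # 0 = even, 1 = odd
        if (a * d - b * c) % 2 == 1:
            patterns.append("".join("O" if v else "E" for v in (a, b, c, d)))
    return patterns


patterns = valid_parity_patterns()
print(len(patterns))          # 6
print(patterns)
print("OOOO" in patterns)     # False: an all-odd matrix signals an error
```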

Further the most likely error is alteration of a single bit, and thus the most likely number is the number to either side of the incorrect received number.

Thus, if every entry of the received matrix is odd, then the parity is O O O O which is "all odd" and does not occur in the table above. Hence, there is an error. The most likely correction is a single parity change. If a single change is required the possible candidates are the matrices corresponding to lines 1, 2, 3 and 6 of the above table.

Second Algorithm

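The algorithm of the present invention is not confined to the above example. In a second example a column matrix or vector having an upper number and a lower number is used for the encoding with the following rules:

(1) If the bit to be encoded is a zero, add the two numbers in the matrix to each other, leaving the upper number as it is, and replacing the lower number with the sum, and

(2) If the bit to be encoded is a one, add the two numbers in the matrix to each other, leaving the lower number as it is, and replacing the upper number with the sum.

Again the starting matrix is the unity matrix $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$.

Thus to encode the sequence 0101, taking the bits from the least significant end, gives

$\begin{pmatrix} 1 \\ 1 \end{pmatrix} \to \begin{pmatrix} 2 \\ 1 \end{pmatrix} \to \begin{pmatrix} 2 \\ 3 \end{pmatrix} \to \begin{pmatrix} 5 \\ 3 \end{pmatrix} \to \begin{pmatrix} 5 \\ 8 \end{pmatrix}$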

Thus the sequence 0101 is represented by the matrix or vector (5,8). This represents a unique mapping of the sequence into a two dimensional space. Further, an arbitrarily long binary sequence can be represented by two numbers of arbitrary magnitude, again order being important in the representation.

To decode, the rule can be expressed as follows:

(3) If the upper number is larger than the lower number, "emit" a one and replace the upper number with the difference of the two numbers whilst leaving the lower number as is, and

(4) If the lower number is larger than the upper number, "emit" a zero and replace the lower number with the difference of the two numbers whilst leaving the upper number as is. The decoding ceases when the unity matrix is reached.

To decode the above example,

(5,8): emit 0, giving (5,3); emit 1, giving (2,3); emit 0, giving (2,1); emit 1, giving (1,1).

Again, the decoded sequence is "emitted" in reverse order.
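The second algorithm can be sketched in Python as below. The function names and the input checks are assumptions added for illustration; the arithmetic follows rules (1) to (4).

```python
def encode_vec(bits):
    """Rules (1) and (2): start from (1, 1) and add into one component."""
    upper, lower = 1, 1
    for bit in bits:
        if bit == 0:
            lower = upper + lower    # rule (1)
        else:
            upper = upper + lower    # rule (2)
    return upper, lower


def decode_vec(upper, lower):
    """Rules (3) and (4): subtract until the unity vector (1, 1) is reached."""
    bits = []
    while (upper, lower) != (1, 1):
        if upper < 1 or lower < 1 or upper == lower:
            raise ValueError("not a vector produced by rules (1) and (2)")
        if upper > lower:
            bits.append(1)           # rule (3)
            upper -= lower
        else:
            bits.append(0)           # rule (4)
            lower -= upper
    return bits


vec = encode_vec([1, 0, 1, 0])       # 0101, least significant bit first
print(vec)                           # (5, 8)
print(decode_vec(*vec))              # [0, 1, 0, 1]
```

The decoded list is in order of emission, which is the reverse of the order in which the bits were encoded.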

Third Algorithm

In a third example, it is not necessary to use addition; another binary operation such as multiplication can suffice. If a column matrix or vector of the same form is again used, then the following encoding rules can apply:-

(a) If the bit to be encoded is a zero, multiply the two numbers together, leaving the upper number unchanged, and replacing the lower number with the product, and

(b) If the bit to be encoded is a one, multiply the two numbers together, leaving the lower number unchanged, and replacing the upper number with the product.

Since 1 x 1 = 1, and thus 1 will not give a product different from itself when multiplied by itself, the starting matrix is $\begin{pmatrix} 2 \\ 2 \end{pmatrix}$.

Thus to encode the sequence 0101 gives

$\begin{pmatrix} 2 \\ 2 \end{pmatrix} \to \begin{pmatrix} 4 \\ 2 \end{pmatrix} \to \begin{pmatrix} 4 \\ 8 \end{pmatrix} \to \begin{pmatrix} 32 \\ 8 \end{pmatrix} \to \begin{pmatrix} 32 \\ 256 \end{pmatrix}$

Thus the sequence 0101 is represented by the matrix or vector (32,256). This again represents a unique mapping of the sequence into a two dimensional space. It will be observed that taking the logarithm to base 2 of the vector of this example (at any stage) results in the vector of the previous example. That is log_{2} 32 = 5 and log_{2} 256 = 8.

In order to decode, the rule can be expressed as follows:

(c) If the upper number is larger than the lower number, "emit" a one and divide the smaller number into the larger number, leaving the smaller number unchanged, and replacing the larger number with the result of the division, and

(d) If the lower number is larger than the upper number, "emit" a zero and divide the smaller number into the larger number, leaving the smaller number unchanged and replacing the larger number with the result of the division. The decoding ceases when the starting matrix is reached.
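A corresponding sketch of the multiplicative variant is given below, again with hypothetical function names and added input checks. The final lines illustrate the relationship with the additive example by taking base 2 logarithms of the encoded vector.

```python
def encode_mul(bits):
    """Rules (a) and (b): start from (2, 2) and multiply into one component."""
    upper, lower = 2, 2
    for bit in bits:
        if bit == 0:
            lower = upper * lower    # rule (a)
        else:
            upper = upper * lower    # rule (b)
    return upper, lower


def decode_mul(upper, lower):
    """Rules (c) and (d): divide until the starting vector (2, 2) is reached."""
    bits = []
    while (upper, lower) != (2, 2):
        if upper < 2 or lower < 2 or upper == lower:
            raise ValueError("not a vector produced by rules (a) and (b)")
        if upper > lower:
            if upper % lower:
                raise ValueError("division is not exact")
            bits.append(1)           # rule (c)
            upper //= lower
        else:
            if lower % upper:
                raise ValueError("division is not exact")
            bits.append(0)           # rule (d)
            lower //= upper
    return bits


vec = encode_mul([1, 0, 1, 0])       # 0101, least significant bit first
print(vec)                           # (32, 256)
print(decode_mul(*vec))              # [0, 1, 0, 1]
# For exact powers of two, bit_length() - 1 is the base 2 logarithm.
print(vec[0].bit_length() - 1, vec[1].bit_length() - 1)   # 5 8
```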

The above examples result in the original information bearing sequence being compressed by being mapped uniquely into a two dimensional space. The mapping is such that error detection and/or correction is possible since the vector components (ie. the two numbers) must be relative primes.

Any sequence of positive whole numbers can be similarly faithfully encoded and decoded. The modified statement of the algorithm to accept a sequence of nonzero positive integers n_{1}, n_{2}...n_{i}..., indexed by i, is as follows:

Step 1. Set a 2x2 matrix to the values of the identity matrix,

ie. a=1, b=0, c=0, d=1.

Step 2. If i is odd then

Set a=a+(n_{i}×c)

Set b=b+(n_{i}×d)

If i is even then

Set c=c+(n_{i}×a)

Set d=d+(n_{i}×b)

Step 3. If the last input is reached

emit the present values of a, b, c, d

otherwise goto Step 2 and repeat for the next number.

If it is required to decode in the original order then

Set x=b+d

Set y=a+c.

If it is required to decode in reverse order then

Set x=a+b

Set y=c+d.

To decode

Step 1. Accept x and y

Step 2. If x > y then

determine the least number n such that x-(n×y)

is less than or equal to y

Emit n.

Set x=x-(n×y).

If y > x then

determine the least number n such that y-(n×x)

is less than or equal to x

Emit n.

Set y=y-(n×x).

If x=y then END

otherwise goto Step 2.
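By way of illustration only, the encoding and decoding steps above can be sketched in Python as follows. The function names `encode` and `decode_pair` are not part of the specification and are chosen here for convenience. The sketch assumes that the matrix entries are labelled a, b in the upper row and c, d in the lower row, that the reverse-order pair is formed from the row sums (a+b, c+d) and the original-order pair from the column sums (b+d, a+c), and that after each number is emitted the larger of x and y is reduced by the amount subtracted.

```python
def encode(seq):
    """Fold a sequence of positive integers into a 2x2 matrix (a, b, c, d)."""
    a, b, c, d = 1, 0, 0, 1              # Step 1: identity matrix
    for i, n in enumerate(seq, start=1):
        if i % 2 == 1:                   # odd index: upper row += n * lower row
            a, b = a + n * c, b + n * d
        else:                            # even index: lower row += n * upper row
            c, d = c + n * a, d + n * b
    return a, b, c, d


def decode_pair(x, y):
    """Emit the numbers recovered from the pair (x, y) until x equals y."""
    out = []
    while x != y:
        if x > y:
            n = (x - 1) // y             # least n with x - n*y <= y
            x -= n * y
        else:
            n = (y - 1) // x             # least n with y - n*x <= x
            y -= n * x
        out.append(n)
    return out


a, b, c, d = encode([5, 1, 4, 2])
print(decode_pair(a + b, c + d))         # row sums: reverse order
print(decode_pair(b + d, a + c))         # column sums: original order
```

Each encoding step leaves the determinant ad-bc equal to 1, which is the property relied upon later in the description.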

If it is only required to decode in reverse order the matrix of the encoding procedure need only be the 2x1 matrix initially set as

The encoding algorithm given above is then followed save that references to b or d are ignored.

Data Compression

However, if one is prepared to lose this capability, a still further compression is available by means of what is conveniently referred to as a Z transformation. In this connection, reference is made to the table of Fig. 5 which is able to be conveniently stored in the form of a ROM or look-up table in modern digital processing equipment.

The table of Fig. 5 is formed as a number of rows and columns. At the start of each row is an integer (which conveniently represents the first of the two vector components) and at the head of each column is an integer (which conveniently represents the second of the two vector components). Thus each entry in the table is a potential representation of the two dimensional space into which the sequence is mapped.

However, since the numbers of the two dimensional space must be relative primes, it is not possible for a sequence to be mapped into some entries in the table such as (2,4) (6,2) and in general (n,m) where both n and m are divisible by a factor greater than one. These entries are therefore marked with an asterisk *. The remaining entries are then numbered 1, 2, 3 ... and so on, so that the vector components (5,2) are represented by the number 13 whilst the vector components (4,5) are represented by the number 25, etc. It is known from number theory that the remaining entries constitute about 60.79% (that is, 6/π^{2}) of the total number of squares. That is about 40% of the squares will carry asterisks.
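The proportion of table entries that do not carry an asterisk can be checked numerically. The following Python sketch is illustrative only: it counts the relatively prime entries of a square table with rows and columns numbered 1 to n and compares the proportion with the number-theoretic density 6/π^{2} of relatively prime pairs. The function name `coprime_fraction` and the square layout are assumptions made for the illustration and do not reproduce the numbering of Fig. 5.

```python
from math import gcd, pi


def coprime_fraction(n):
    """Fraction of entries (row, column) in an n x n table with gcd equal to 1."""
    usable = sum(
        1
        for row in range(1, n + 1)
        for col in range(1, n + 1)
        if gcd(row, col) == 1
    )
    return usable / (n * n)


print(coprime_fraction(100), 6 / pi ** 2)
```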

Using the table of Fig. 5, it will be seen that the abovementioned sequence 0101 which can be represented as (5,8) can be further compressed and represented by the single number 53.

In addition, this Z transformation can be undertaken repeatedly. So to return to the first example of the sequence 01101, the matrix can be regarded as two pairs of numbers, say, (3,5) and (4,7). If each of these pairs is Z transformed according to the table of Fig. 5 then the result is two numbers 20 and 25.

If a second table, similar to that of Fig. 5, but omitting the asterisks and consecutively numbering each entry in the second table is prepared, then this second table can be used for a further Z transformation. This transforms the two numbers (20,25) into a single number being the number allocated to that particular entry in the second table.

It follows from the above that a sequence of numbers can be transformed, via any one of a large number of various routes, into a single number. The larger the number of members of the sequence, the larger the magnitude of the single number. However, the number of possible routes (each with a possible different mathematical procedure) and the number of possible Z transformations (given that the allocation of numbers to the table entries is arbitrary), makes "code breaking" extremely difficult. An estimate of this difficulty is given in the Appendix. Indeed, the arbitrary nature of the tabulation of the Z transformation is equivalent to a "one time pad".

It also follows from the above that the term "matrix" as used herein refers not only to conventional row, column or row and column matrices but also to vectors having components able to be represented by entries in such matrices.

Another advantage which is able to be obtained is the ability to avoid transmitting the entire magnitude of the encoded numerical result. For example, if the digital sequence to be encoded and transmitted is broken down into "packets" each of which is, say, 256 bits long, then a number of such 256 bit sequences will need to be transmitted.

Because the data to be transmitted will not normally be all zeros or all ones, the actual numerical result of the transformation will normally lie within a small range of the potential range of results. If the centre of this small range is known to both the transmitting and receiving party, then only the difference between the result and the centre of the range need be transmitted. This stratagem can substantially reduce the magnitude of the number(s) to be transmitted. For matrices of determinant 1, as has been described, three numbers only need be known. It is possible to considerably reduce the magnitude of one of these three numbers thus enhancing the application of this transformation for data compression. One way of achieving this reduction relies on the observation that cb=ad-1 and each entry must be a whole number.

If for example b is not to be used then c can be replaced by a number indicating which of the possible factorisations of ad-1 into whole numbers corresponds to b and c.

For example, the matrix with upper row (20, 501) and lower row (19, 476) is such a matrix with a=20, b=501, c=19 and d=476.

As 20x476=9520 the product of b and c must be 9519. The only factorisations of 9519 are as shown below.

The number 3 can now be used to indicate the factorisation with c=19 and b=501 and the matrix can be reconstructed from the numbers 20, 476, 3.

It can be shown from number theory that the magnitude of the number indicating a factorisation must in general be considerably less than the one half of the square root of the product of a and d. This matrix encodes 45 binary digits whereas the numbers 20, 476, 3 require 5, 9 and 2 binary digits to represent them - a total of 16.
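The replacement of one entry by a factorisation index can be sketched in Python as follows. This is an illustration only. It assumes that the factorisations of ad-1 are listed in ascending order of the factor standing in the c position, which is consistent with the example above (c=19 being the third such factor of 9519). The function names `divisors`, `factor_index` and `rebuild` are introduced here for the illustration.

```python
def divisors(n):
    """All positive divisors of n in ascending order, by trial division."""
    small, large = [], []
    f = 1
    while f * f <= n:
        if n % f == 0:
            small.append(f)
            if f != n // f:
                large.append(n // f)
        f += 1
    return small + large[::-1]


def factor_index(a, d, c):
    """1-based position of c among the divisors of a*d - 1."""
    return divisors(a * d - 1).index(c) + 1


def rebuild(a, d, k):
    """Recover (a, b, c, d) from a, d and the factorisation index k."""
    n = a * d - 1
    c = divisors(n)[k - 1]
    return a, n // c, c, d


print(factor_index(20, 476, 19))      # index transmitted in place of c
print(rebuild(20, 476, 3))            # matrix recovered by the receiver
print(sum(v.bit_length() for v in (20, 476, 3)))   # binary digits needed
```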

Further reduction in the magnitude of the numbers can be achieved if the number that is omitted is always the largest. To achieve this, information must be transmitted or stored indicating which has been eliminated. The cost of so doing is two binary digits with an arrangement such as 00 indicates a, 01 indicates b, 10 indicates c and 11 indicates d. It is a property of these matrices that the least entry is always diagonally opposite the greatest. This fact can be exploited by subtracting this least number from the other two being transmitted with the result remaining a positive number. It is a further property of these matrices that the numbers that remain after this operation are three numbers from which a valid matrix can be constructed. As a result the process described above for replacing one number by a number indicating a factorisation can be now applied.

For example, the matrix has the greatest entry in the 'c' position

and diagonally opposite in the 'b' position is the least value 57. Using * for the value not recorded the matrix becomes

Subtracting the least entry b=57 from the remaining leaves

To test that this is a valid matrix, check that (23x62)-1 can be evenly divided by 57; (23x62)-1 = 1425 and 1425 divided by 57 is 25 exactly, hence a=23, b=57, d=62 are entries from a valid matrix in which c=25. The factorisations of 1425 are:

As b=57 occurs in the seventh factorisation the three numbers 23, 7, 62 and the two binary digits 10 are sufficient for the reconstruction of

The property that the entries in the matrices are relative prime pairs can also be used to both eliminate various possible matrices since, as indicated in Table 3.1, not all possible matrices are valid. Further, if an arbitrary allocation of numbers is made to the matrices of Table 3.1 a transformation similar to the above described Z transformation, leading to still further data compression, can be achieved by the processes that will be termed arithmetic transforms.

Arithmetic Transforms

The conversion yields relative prime integer pairs. The density of relative prime pairs is 6/π^{2} = ·6079... . The pairs a,b with (a,b) > 1 that occupy the remaining places in an enumeration are exploited as follows.

Accept a,b positive integers with (a,b) = 1.

Assume, without loss, a<b.

Set d=b-a.

Transform a to a' where a' is the ordinal position of a in the sequence of integers prime to d.

Set b' = a' +d.

The transform is invertible as d can be recovered from a', b' and the a'^{th} integer prime to d is readily computed as follows. Given a and d, a' is computed from a vector g_{1}...g_{r} where g_{i} is a prime dividing d.
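A minimal Python sketch of the basic transform and its inverse is given below for illustration. It counts the integers prime to d directly by means of the greatest common divisor, rather than from the prime divisors of d as contemplated above, so it is suitable only for small values. The function names `forward` and `inverse` are assumptions made for the illustration.

```python
from math import gcd


def forward(a, b):
    """Map a relatively prime pair (a, b), a < b, to (a', b')."""
    assert 0 < a < b and gcd(a, b) == 1
    d = b - a
    # ordinal position of a among the integers prime to d
    a_prime = sum(1 for m in range(1, a + 1) if gcd(m, d) == 1)
    return a_prime, a_prime + d


def inverse(a_prime, b_prime):
    """Recover (a, b): d is the difference, a is the a'-th integer prime to d."""
    d = b_prime - a_prime
    m, seen = 0, 0
    while seen < a_prime:
        m += 1
        if gcd(m, d) == 1:
            seen += 1
    return m, m + d


print(forward(7, 13))    # d = 6; the integers prime to 6 begin 1, 5, 7
print(inverse(3, 9))
```

Since a and b are relatively prime, a is itself prime to d, so the count always includes a and the inverse recovers it exactly.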

Full factorisation of d is not a practical procedure nor is it required. The contribution to numerical shrinkage that the i^{th} prime can make is

This ratio rapidly approaches 1 for i increasing. For i≥26 the contribution of prime divisors is less than 1 % and by i=169 is less than .01 %. A numerical change of 50% is required to reduce the representation by 1 bit. Testing for divisibility by the first 63 primes is sufficient for an applied algorithm.

The transform so modified sets a' to be the ordinal position of a in the integers prime to the product of prime divisors (without repetition) of d that are less than or equal to the r^{th} prime. If no prime divisors ≤ p_{r} are found d is treated as if it were prime. Call this transform T_{1}.

The underlying result that establishes isomorphism between finite binary sequences and relative prime pairs enables the mapping of an integer L(a,b), the length of the sequence represented, to each such pair.

If sequence length is known a transform can be applied which sets a' to be the ordinal position of a in the sequence of integers prime to d and of length L(a,b).

Call this transform T_{2}.

Given the result of T_{1} the second transform can be readily computed by the application of identities that relate sequence length to the numerical representation.

A third transform T_{3} is achieved by setting T_{3}(a,b) to be the a'^{th} relative prime pair in the enumeration of pairs prime to d, where a' is the value derived from T_{2}. This variation preserves the capacity to later convert to a unique rational ratio.

When a representation for data blocks of fixed length is required a strategy of ignoring leading or trailing repeated bits can be applied. The appropriate transform T_{4} then sets a' to be the ordinal position of a in the integers prime to d with length less than or equal to L(a,b).
The following table illustrates the effects of these transforms on relative prime pairs with difference 34.

L(a,b) denotes the length of the sequence encoded by a,b.

Furthermore, as indicated in Table 3.2 the products of matrices can be similarly mapped. As the products are formed from two prime matrices, they are equivalent to binary sequences, that is sequences constructed from just two symbols (eg. 0 and 1).

The mathematical paper comprising the Appendix hereto discloses a general treatment. This general treatment deals with the construction of sets of functions that can be composed with each other so as to produce other functions such that no matter what finite number of composites are taken, the original sequence and order in which the composition was formed can always be faithfully recovered. In some cases these functions themselves have a simple numerical representation in which case the numerical representation of the result of composition can be used as a representation of data.

Alternatively, the resultant function can be used to operate on some other mathematical structure which is representable by objects over some number system.

In other cases the functions that are composed do not have particularly convenient numerical forms, however their action on other numerically based mathematical structures result in numerical encoding with the desired properties.

Non-linear Operations

The general treatment in the Appendix is not restricted to linear operations. However, non-linear operations are not always conveniently expressed in matrix form.

The following example involves both linear and non-linear operations and the form of expression chosen is a statement of rules for the performance of operations (ie. an algorithm). An advantage of this form of expression is that the effect of several operations (linear or non-linear) can be expressed in the statement of a further algorithm.

Step E1

Associate with each symbol of the alphabet a numerical variable, eg. if A is in the alphabet q(A) is the variable associated with A.

Step E2

Allocate an order to the variables so that each variable has a separate position number, eg. for an alphabet of three symbols A,B,C six orders are possible.

Step E3

Set the value of the variable of the first symbol to 2 and all other variables to 1.

Step E4

E4.1 consider the next symbol to be encoded and determine the largest value among the variables other than the one associated with the symbol being considered - call this largest value D.

E4.2 denote by P the highest position number among the variables with the value D, other than the one associated with the symbol being considered.

E4.3 set the value of the variable of the symbol being considered to the sum of its present value and D.

E4.4 if the position number of the variable being considered is greater than P, then reduce the value set in step E4.3 by 1.

Step E5

If the symbol being considered is the last in the sequence to be encoded then the process terminates, otherwise go to (or repeat) Step E4.
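Steps E1 to E5 can be sketched in Python as follows, by way of illustration only. The function name `encode` and the use of a string `order` to carry both the alphabet and the position numbers (the first character having position number 1) are assumptions made for the illustration. An alphabet of at least two symbols is assumed.

```python
def encode(sequence, order):
    """Return the final variable values, listed in position-number order."""
    pos = {sym: i for i, sym in enumerate(order, start=1)}   # Step E2
    q = {sym: 1 for sym in order}                            # Step E1
    q[sequence[0]] = 2                                       # Step E3
    for sym in sequence[1:]:                                 # Steps E4 and E5
        others = [s for s in order if s != sym]
        D = max(q[s] for s in others)                        # E4.1
        P = max(pos[s] for s in others if q[s] == D)         # E4.2
        q[sym] += D                                          # E4.3
        if pos[sym] > P:                                     # E4.4
            q[sym] -= 1
    return [q[s] for s in order]


print(encode("ABCAABC", "ABC"))
```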

For example, encoding the sequence ABCAABC requires three variables q(A), q(B), q(C). The chosen order for encoding in this example is the first order of the above table, that is the position of q(A) = 1, the position of q(B) = 2, and the position of q(C) = 3.

For the first letter of the sequence A, in accordance with Step E3, set the first symbol q(A) to 2 and the remaining variables or symbols q(B) and q(C) to 1. This creates the first "line" in an encoding table as follows:

The steps carried out can be set out as follows:

E1 Assign

q(A) to the letter A

q(B) to the letter B

q(C) to the letter C

E2 Assign

position number 1 to q(A)

position number 2 to q(B)

position number 3 to q(C)

E3 First symbol A

set q(A)=2,

set q(B)= 1,

set q(C)= 1.

___________________________________________________________________________

Summary -

Symbol encoded A, present values q(A)=2, q(B)= 1, q(C)= 1

___________________________________________________________________________

Now go onto step E4 by considering the next symbol of the sequence which is the letter B

E4 Next symbol B

E4.1 Variables other than q(B) are q(A) = 2 and q(C)= 1, q(A) > q(C), hence D=2.

E4.2 Only q(A) is equal to D. Position number of q(A) is 1, hence

P=1.

E4.3 B is assigned to q(B).

The present value of q(B) is 1.

Set q(B) to 1 +D = 1 +2 = 3.

E4.4 Position number of q(B) is 2 and P=1 so the position number of symbol being considered is greater than P. Reducing the value of q(B) set in 4.3 by 1 gives q(B)=2.

___________________________________________________________________________

Summary

Symbol encoded B, present values q(A)=2, q(B)=2, q(C)= 1

___________________________________________________________________________

Thus the second "line" in the encoding table is as follows:

* where 2=2+1-1

The process is repeated so as to build up the full encoding table.

E5 B is not the last symbol, go to step E4.

E4 Next symbol is C

E4.1 Variables other than q(C) are q(A)=2 and q(B)=2 hence D=2.

E4.2 Both q(A) and q(B) equal D. Position number of q(A) is 1 and of q(B) is 2, hence 2 is the higher position number among the variables other than q(C) so set P=2.

E4.3 C is assigned to q(C) which has present value 1.

Set q(C) = 1 +D = 1 +2 = 3.

E4.4 Position number of q(C) is 3 which is greater than P of E4.2 so reduce the value of q(C) set in E4.3 by 1.

q(C) = 3-1 = 2.

___________________________________________________________________________

Summary - Symbol encoded C, present values q(A)=2, q(B)=2, q(C)=2

___________________________________________________________________________

E5 C is not the last symbol, go to step E4.

E4 Next symbol A.

E4.1 Variables other than q(A) are q(B)=2 and q(C)=2 hence D=2.

E4.2 Both q(B) and q(C)=2.

The position number of q(C)=3 is greater than the position number of q(B) hence P=3.

E4.3 The present value of q(A) is 2.

Set q(A) to 2+D = 2+2 = 4.

E4.4 The position number of q(A) is 1 which is less than P so q(A) is not reduced by 1.

___________________________________________________________________________

Summary -

Symbol encoded A, present values q(A)=4, q(B)=2, q(C)=2

___________________________________________________________________________

E5 A is not the last symbol, go to step E4.

E4.1 Variables other than q(A) are q(B)=2 and q(C)=2,

2 is the greatest value, set D=2

E4.2 Both q(B) and q(C) equal D, however, the position number of q(C) is greater than the position number of q(B) so P is set to the position number of q(C) hence P=3.

E4.3 The present value of q(A) is 4.

Set q(A) to 4+D = 4+2 = 6.

E4.4 The position number of q(A) is 1 which is less than P currently set to 3 so q(A) is not reduced by 1.

___________________________________________________________________________

Summary -

Symbol encoded A, present values q(A)=6, q(B)=2, q(C)=2

___________________________________________________________________________

Thus the full encoding table is as follows:

The output from the algorithm is a numerical representation for the input sequence of letters. In this example, the numbers 6,7,8 in this order represent the sequence ABCAABC.

In order to decode the decoder must know -

1. That the numbers 6,7,8 mean q(A)=6, q(B)=7, q(C)=8.

2. The chosen order for encoding - in this case

position of q(A)=1

position of q(B)=2

position of q(C)=3.

The knowledge required for decoding can be considered as the key for an encrypted message.

Numerical representations are decoded by the following procedure. Consider the numerical representation and,

Step D1.

Determine the variable with the largest value - call this value R.

If more than one variable has the largest value select from these the one with the highest position number.

Step D2.

Emit the symbol associated with the variable determined in step D1.

Step D3.

Determine the highest value among variables other than the variable determined in step D1 - call this value D.

Step D4.

Determine the highest position number of variables with the value of D from step D3, excluding the variable determined in step D1. Call this position number P.

Step D5.

Set the value in the variable from step D1 to R-D.

Step D6.

If the position number of the variable from step D1 is greater than P then increase its value by 1.

Step D7.

If one value is 2 and all other values are 1 then terminate, otherwise go to step D1.
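Steps D1 to D7 can be sketched in Python as follows, by way of illustration only. The function name `decode` and the string `order` (the first character having position number 1) are assumptions made for the illustration, as is the use of single-character symbols. Two points are inferred rather than stated above: the symbols are emitted last-first, so the emitted list is reversed, and the first symbol of the sequence is read from the variable left holding the value 2 when the procedure terminates.

```python
def decode(values, order):
    """Recover the symbol sequence from the final variable values."""
    pos = {sym: i for i, sym in enumerate(order, start=1)}   # position numbers
    q = dict(zip(order, values))
    terminal = [1] * (len(order) - 1) + [2]                  # Step D7 condition
    emitted = []
    while sorted(q.values()) != terminal:
        R = max(q.values())                                  # D1: largest value
        sym = max((s for s in order if q[s] == R), key=pos.get)
        emitted.append(sym)                                  # D2
        others = [s for s in order if s != sym]
        D = max(q[s] for s in others)                        # D3
        P = max(pos[s] for s in others if q[s] == D)         # D4
        q[sym] = R - D                                       # D5
        if pos[sym] > P:                                     # D6
            q[sym] += 1
    first = next(s for s in order if q[s] == 2)              # first symbol encoded
    return first + "".join(reversed(emitted))


print(decode([6, 7, 8], "ABC"))
```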

Decoding the present example proceeds as follows:

In this example the variable to be increased is determined only by the symbol to be encoded. Another method is to make this determination dependent on both the next symbol to be encoded and the variable determined for the symbol before it. Such systems are termed "transition or finite state systems" and have a fundamental role in many applied sciences including electronic engineering (see "Switching Theory, Volume 2: Sequential Circuits and Machines", Raymond E Miller, (1965) Published by John Wiley & Sons, Inc.) and linguistics (see "Formal Languages and their Relation to Automata", John E Hopcroft and Jeffrey D Ullman, (1969) Published by Addison-Wesley Publishing Company, Inc.). The development in linguistics has application in digital processing of natural and formal languages (see "Text Compression", Timothy C Bell, et al; (1990) Published by Prentice-Hall Inc.). By associating variables with states these procedures give a numerical representation of the behaviour of finite state devices.

The application of the above to the representation of data and text is achieved as follows:

Variables are assigned numbers as they are not assigned directly to letters of an alphabet. There must be at least as many variables as alphabet symbols if all possible sequences are to be represented, however any countable number of variables can be employed. Some applications do not require all possible sequences to be represented in which case it is possible to find a representation with less variables than symbols.

As for the previous example variables are assigned an order. In the example that follows the "natural" order is chosen ie. position of q1 is 1, q2 is 2

eg. if the last variable increased was q2 and the next symbol is A then the next variable to be increased is q4.

An "initial" or "start" variable must be nominated. Different numerical representations arise from each possible start state, eg. encoding ABCA.

Thus the same sequence ABCA can be encoded either as 7132 or as 2313. Decoding requires a key that is knowledge of

1. The order assigned to variables.

2. The transition table.

3. The start state.

From the transition table a table is constructed from which can be read the output symbol for each possible transition between variables. For the present example the table is:

With due change of detail decoding follows the procedure D1 to D5.

Another direction of generalisation allows each numerical variable to be replaced by a collection of variables, termed subvariables. When executing an encoding algorithm at the stage at which the value in one variable is to be added to another, a single subvariable of one is added to a single subvariable of the other.

Rules for selecting the subvariables can take a variety of forms, two are described below.

Selection Rules.

Example 1. Unordered subvariables.

This selection rule is based on the magnitude of subvariables only and does not require their order to be maintained. All subvariables are initially set to 1.

(a) The magnitude of a variable is taken to be the greatest value among its subvariables.

(b) The sum required by E 4.3 is performed as follows - Select any subvariable with a value equal to the least value among the subvariables being considered and set its value to the sum of its present value and D. The value D having been determined with the rule (a) as above.

To encode the sequence 10010 proceed as follows:

10010

({1 1 1 1 },{1 1 1})1

({2 1 1 1 },{1 1 1})0

({2 1 1 1 },{2 1 1})0

({2 1 1 1 },{2 2 1})1

({2 3 1 1 },{2 2 1})0

({2 3 1 1 },{2 2 3})

Decoding proceeds as follows:

({2 3 1 1 },{2 2 3})0

({2 3 1 1 },{2 2 1})1

({2 1 1 1 },{2 2 1})0

({2 1 1 1 },{1 2 1})0

({2 1 1 1 },{1 1 1})1

({1 1 1 1 },{1 1 1})

01001^{R}= 10010
As the order among subvariables need not be known to decode they can be arranged in ascending order of magnitude which allows them to be efficiently stored or transmitted as successive differences,

eg. 6,6,8,12,17 becomes 6, (6-6)_{0}, (8-6)_{2}, (12-8)_{4}, (17-12)_{5} and is reconstructed by 6, (6+0)_{6}, (6+2)_{8}, (8+4)_{12}, (12+5)_{17}.
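The storage of ordered subvariables as successive differences can be sketched in Python as follows. The function names `to_differences` and `from_differences` are chosen for the illustration only.

```python
from itertools import accumulate


def to_differences(values):
    """Sort ascending, keep the first value and then each successive difference."""
    ordered = sorted(values)
    return ordered[:1] + [b - a for a, b in zip(ordered, ordered[1:])]


def from_differences(diffs):
    """Rebuild the ascending values by running summation."""
    return list(accumulate(diffs))


print(to_differences([6, 6, 8, 12, 17]))
print(from_differences([6, 0, 2, 4, 5]))
```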

Example 2. Ordered subvariables.

If subvariables as well as variables are given a position number, then the sum required by Step E4.3 can be determined by specifying the order in which subvariables are used. The order can be specified by a sequence of any length. For example assume that a variable q(x) is assigned subvariables then the sequence

..is interpreted as follows.

The first time q(x) is added to take the sum with subvariable 3, the second time with subvariable 2, and so on. A different use order can be assigned to each variable.

Specification of order of use must be known to decode the resultant numerical representation and becomes a component of the key for encryption.

The following modification to the algorithms given for encoding sequences of positive numbers is appropriate for alphabets of any finite size. It is not linear.

Assign to each letter of an alphabet comprising s symbols a distinct integer in the range 1 to s.

If a and b are integers denote by a\b the greatest integer less than or equal to the result of dividing a by b,

examples 7\2=3, 7\8=0, 7\1 =7

Replace the symbols in a sequence by the integers assigned to them thus forming a sequence of integers n_{1}, n_{2}...n_{i}... indexed by i.

Step 1. Set a 2x1 matrix to the value

Step 2. If i is odd then

Set D=b-b\s

Set a=a+(n_{i}xD)

If i is even then

Set D=a-a\s

Set b=b+(n_{i}xD)

Step 3. If the last integer is reached

emit the present value of a and b

otherwise goto Step 2

and repeat for the next number.

To decode

Step 1. Accept a and b

Step 2. If a>b then

Set D=b\s

determine the least number n such that

a-(nxD) is less than or equal to b

Emit n

If b> a then

Set D=a\s

determine the least number n such that

b-(nxD) is less than or equal to a

Emit n

If a=b then end.

Step 3. Goto Step 2.

The procedures described involve only the operations of addition and subtraction of whole numbers (integers) and the determination of greatest and least values among a collection of integers.

The formulation provided in the appended paper is far more general. There is a technical (ie. mathematical) sense in which the examples based on integers, addition, subtraction and multiplication provide a generic model for most of the cases covered by the general treatment appended.

Most of the general treatment is homomorphic to the integer case meaning that rules can be found (in principle) which allow them to be recognised as being of the same general form. An elementary example is provided by multiplication of numbers less than 1.

One of the additive encodings of 011 can be displayed by

(1,1)0(1,2)1(3,2)1(5,2) where the left is added to the right for 0 and the right is added to the left for 1.

If the initial (1,1) is replaced by (·1,·1) and the right is multiplied by the left for 0 and the left is multiplied by the right for 1, the encoding becomes - (·1,·1)0(·1,·01)1(·001,·01)1(·00001,·01).

Decoding involves finding the numerically smaller of a pair then dividing. The relation between the two is revealed if the entries in the second case are replaced by the number of decimal places eg. ·01 has two decimal places so it is replaced by a 2. If this is done the additive case results.
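The correspondence between the multiplicative and additive cases can be checked with the following Python sketch, given for illustration only. It uses the standard `decimal` module so that the number of decimal places can be read exactly. The function names `additive`, `multiplicative` and `decimal_places` are assumptions made for the illustration.

```python
from decimal import Decimal


def additive(bits):
    """Start at (1, 1); add left to right for 0, right to left for 1."""
    left, right = 1, 1
    for bit in bits:
        if bit == "0":
            right += left
        else:
            left += right
    return left, right


def multiplicative(bits):
    """Start at (0.1, 0.1); multiply right by left for 0, left by right for 1."""
    left, right = Decimal("0.1"), Decimal("0.1")
    for bit in bits:
        if bit == "0":
            right *= left
        else:
            left *= right
    return left, right


def decimal_places(value):
    """Number of digits after the decimal point of an exact Decimal."""
    return -value.as_tuple().exponent


pair = multiplicative("011")
print(pair, tuple(decimal_places(v) for v in pair), additive("011"))
```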

Knowledge about which operations have been used and the initial values chosen become another component of a key for encryption and decryption.

It is not the case that the present invention is restricted to binary. The following tables set out encodings of hexadecimal and the ordinary English alphabets.

In the above, the hyphen (-) represents a blank space.

Furthermore, the encoding can be used to produce very great savings in the representation of text, instructions, warning messages and the like. This is particularly useful for small items such as resistors which can be provided with part numbers represented, say as machine readable bar codes. Alternatively, a warning message such as DO NOT INCINERATE can be encoded and decoded in accordance with the following

DO NOT INCINERATE

As an example of a bar code produced in accordance with the above, in Fig. 5 is reproduced a bar code which represents the number 2 4 6 22 26 14 8 47 22 and 22.

It will be immediately apparent that if there is an agreed method of representing the text of standard warnings so that, for example

* represents STERILE PROVIDED PACKAGE IS INTACT AND UNOPENED

@ represents ALLERGIC TO PENICILLIN

then large volumes of text of standard messages can be condensed into, say, bar codes which are both machine readable (as at present) but also then decoded by the machine. As a consequence, "wiping" the bar code with a light pen would result in the screen of the machine displaying the full text of the decoded message.

Egyptian Fractions

The transformation from a 2x2 matrix to a 2x1 matrix, equivalent to a rational number, can be used to explicate consideration of the decomposition of rationals into partial fractions. The relationship with decomposition into distinct Egyptian (unit) fractions is immediate since in order that such a process terminates the final step is a subtraction of the form a/c - b/d constrained by the requirement that ad-bc=1. Because ad-bc = 1 is another way of representing the matrix 1

The following technique is essentially equivalent to a matrix technique but expressed in different notation.

This procedure generalises to enable encoding sequences over k different symbols as follows:

Step 1. Assign to each of the k symbols a distinct integer in the range 0 to k-1. For example over 7 symbols 1,0,6,5,3,1,2 would be a possible sequence.

Assign to the sequence a pair of integers u

b=k^{v}

where

a and b divided by (a,b).

This expression is equivalent to forming the sums of the form

c_{1}/k^{1} + c_{2}/k^{2}...c_{n}/k^{n}.

For example the sequence 1,2,0,5,3 with k=6 of length n=5 is assigned the partial fraction 1/6^{1} + 2/6^{2} + 0/6^{3} + 5/6^{4} + 3/6^{5}.

The rational number is "decoded" as follows

Step 1. Accept a,b

Set n= 1

Step 2. Set m=k-1
Step 3. If m/k^{n} is less than or equal to the rational number a/b then

Emit m

Set n=n+ 1

Goto Step 2

Else if m > 1

Set m=m-1, repeat Step 3

Else if m= 1

Emit 0

Set n=n+1

If n = string length then end

Else goto Step 2.
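The assignment of a rational number to a sequence over k symbols, and its decoding, can be sketched in Python as follows using the standard `fractions` module. This is an illustration only. The function names `encode` and `decode` are assumptions, and the sketch assumes that once a digit m has been emitted the term m/k^{n} is subtracted from the rational number before the next place is examined. The string length is supplied to the decoder, as in the steps above.

```python
from fractions import Fraction


def encode(digits, k):
    """Sum c_i / k**i over the sequence; Fraction keeps the result in lowest terms."""
    return sum(Fraction(c, k ** i) for i, c in enumerate(digits, start=1))


def decode(value, k, length):
    """Recover `length` digits by taking the largest admissible m at each place."""
    out = []
    for n in range(1, length + 1):
        m = k - 1
        while m > 0 and Fraction(m, k ** n) > value:
            m -= 1
        out.append(m)
        value -= Fraction(m, k ** n)     # assumed step: remove the emitted term
    return out


r = encode([1, 2, 0, 5, 3], 6)
print(r, decode(r, 6, 5))
```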

Miscellaneous

Pairs of relative prime numbers form the mathematical system of rational numbers (fractions) conventionally denoted by Q. The elements of Q^{+} are the positive rational numbers. Rational numbers have a variety of representations.

The following three methods of representing rational numbers are those treated in the standard references "Handbook of Mathematical Functions", Ed. Milton Abramowitz and Irene A. Stegun, (Dover Publications, Inc.; New York, 9th Ed, 1970.) and "Handbook of Mathematical, Scientific, and Engineering Formulas, Tables, Functions, Graphs, Transforms", Staff of Research and Education Association, Dr. M. Fogiel, Director, (REA 61 Ethel Rd West, Piscataway, New Jersey 1989).

(1) A pair of positive integers eg. 3,7.

(2) As a decimal number that either terminates or (after a certain stage) recurs.

The "decimal" for a rational number may terminate in one scale and recur in another. For example 1/7=0·142857 (recurring) in the scale of 10 and 0·1 in the scale of 7.

(3) A regular continued fraction

which is completely described by the sequence a_{0}, a_{1}, a_{2}, a_{3}..... .

Unlike decimal or place notation in 2 above, the continued fraction representing a rational number always terminates, and is independent of the number scale (except for the choice of scale to represent the integers in the sequence a_{0}, a_{1}...a_{n}).

For example 27/38 has a decimal place notation of 0·7105263157894736842 (scale 10) and the sequence 1,2,2,5 represents the continued fraction

which is equivalent to 27/38.
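The continued fraction of a rational number can be computed by repeated division, as in the following illustrative Python sketch. The function names `continued_fraction` and `from_terms` are assumptions made for the illustration. For 27/38 the leading term a_{0} is 0 and is followed by the terms 1, 2, 2, 5.

```python
from fractions import Fraction


def continued_fraction(a, b):
    """Terms a0, a1, ... of the regular continued fraction of a/b."""
    terms = []
    while b:
        q, r = divmod(a, b)
        terms.append(q)
        a, b = b, r
    return terms


def from_terms(terms):
    """Rebuild the rational number from its continued fraction terms."""
    value = Fraction(terms[-1])
    for t in reversed(terms[:-1]):
        value = t + 1 / value
    return value


print(continued_fraction(27, 38), from_terms([0, 1, 2, 2, 5]))
```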

The following table shows the place representation in scales 2 to 9.

SCALE 27/38

2:0·1011010111100101000

3:0·201011222021211000

4:0·2311321100

5:0·323401441

6:0·4132501521

7:0·465

8:0·5536241

9:0·634867730

It is evident that the numerical efficiency of place notation is variable. The number of places required depends on the factor structure of the scale in relation to the denominator and the exponent to which the scale belongs mod the denominator.
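The place representation of a fraction in any scale, including the point at which it begins to recur, can be obtained by long division while remembering the remainders already met. The following Python sketch is illustrative only and the function name `expansion` is an assumption. It returns the digits before the recurring block and the recurring block itself for a fraction a/b with 0 < a < b.

```python
def expansion(a, b, scale):
    """Return (leading_digits, recurring_digits) of a/b in the given scale."""
    digits, seen = [], {}
    r = a % b
    while r and r not in seen:
        seen[r] = len(digits)        # place at which this remainder first arises
        r *= scale
        digits.append(r // b)
        r %= b
    if not r:                        # the representation terminates
        return digits, []
    start = seen[r]
    return digits[:start], digits[start:]


for scale in (2, 7, 10):
    print(scale, expansion(27, 38, scale))
```

For 27/38 the representation in the scale of 7 recurs after three places, whereas in the scales of 2 and 10 a single leading place is followed by a recurring block of eighteen places.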

For the purposes of data processing and storage, numerically efficient representation of the rational numbers that would be used to represent data sequences becomes of central commercial importance. The patent application deals with several efficient alternatives developed by the applicant discussed under the heading "Compression".

The mathematical results appended show that finite sequences of binary digits are related to positive rational numbers in the following way.

1. Every finite sequence of binary digits is represented by a unique positive rational number.

2. Every positive rational number is related to (determined by) a unique finite binary sequence.

3. The binary sequence of a particular rational number a/b can be calculated by applying an algorithm which involves only comparison and subtraction, as follows.

Step 1. Accept a and b

set X=a, Y=b

Step 2. If X > Y emit 0 set X=X-Y

If Y > X emit 1 set Y=Y-X

Step 3. If X=Y END else

goto step 2.
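The steps of item 3 can be sketched directly. This is a minimal illustration, not the applicant's implementation; the function name `rational_to_bits` is an assumption, and `Fraction` is used so that non-integer but rational inputs are handled exactly.

```python
from fractions import Fraction

def rational_to_bits(a, b):
    """Emit the binary sequence of the ratio a:b using only comparison
    and subtraction (Steps 1 to 3 of item 3)."""
    x, y = Fraction(a), Fraction(b)          # Step 1
    if x <= 0 or y <= 0:
        raise ValueError("both values must be positive")
    bits = []
    while x != y:                            # Step 3: stop when X = Y
        if x > y:                            # Step 2, first branch
            bits.append(0)
            x -= y
        else:                                # Step 2, second branch
            bits.append(1)
            y -= x
    return bits

print(rational_to_bits(5, 8))     # [1, 0, 1, 0]
print(rational_to_bits(25, 40))   # [1, 0, 1, 0]
```

Because only the ratio of the two inputs matters, 5:8 and 25:40 yield the same sequence.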

4. The positive rational numbers associated with a particular binary sequence can be calculated by applying an algorithm that involves discrimination between input values and addition only.

Step 1. Set X to any non-zero value.

Set Y=X.

Step 2. Input binary digits.

Step 3. If input=0 set Y=Y+X

If input= 1 set X=X+Y.

Step 4. If last input then end and output X,Y

else goto step 2.
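The steps of item 4 can be sketched in the same way. The function name `bits_to_pair` and the parameter `start` (the arbitrary non-zero initial value of Step 1) are illustrative assumptions.

```python
def bits_to_pair(bits, start=1):
    """Build the pair X, Y associated with a binary sequence using only
    discrimination between input values and addition (Steps 1 to 4 of item 4)."""
    x = y = start                 # Step 1: X non-zero, Y = X
    for bit in bits:              # Step 2: read the next digit
        if bit == 0:              # Step 3
            y = y + x
        else:
            x = x + y
    return x, y                   # Step 4

print(bits_to_pair([0, 1, 1]))            # (5, 2)
print(bits_to_pair([0, 1, 1], start=3))   # (15, 6)
```

A different starting value only scales both outputs by a common factor, so the ratio X:Y is unchanged.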

5. The relationship between the positive rational number of a binary sequence and positive rational number in the reverse order is X' = [X]_{y} ^{- 1}

Y' =[Y]_{x} ^{-1}

6. Every positive rational number is uniquely represented by a 2x2 positive integer matrix of determinant 1 - hereafter called a 'valid matrix'.

7. Every valid matrix is uniquely represented by a positive rational number.

8. The positive rational number associated with a valid matrix can be calculated as explained above.

9. The valid matrix associated with a positive rational number can be determined as explained above.

10. The results 1-9 above allow finite binary sequences, positive rational numbers and valid matrices to be used interchangeably. Some consequences of this interchangeability are:

A. Since positive rational numbers, finite binary sequences and valid matrices are interchangeable, an operation on one of these three has a corresponding operation for each of the other two. Thus multiplying two valid matrices "joins" or concatenates the corresponding binary sequences. The corresponding operation on rational numbers is not yet named in mathematical literature (but understood and well defined). Similarly, the addition of rational numbers produces a combination of binary sequences for which no name has yet been coined.

Further, inverting a rational number (so that a/b becomes b/a) corresponds to taking the complement of a binary sequence - a process basic to digital electronics and especially logic circuits. The corresponding action on a valid matrix is to generate a transpose about the minor axis.

That is, is transposed to

B. The encoding into two numbers and the encoding into four numbers result in number pairs that have no factors in common other than 1, that is they are relatively prime pairs. Another interpretation of such pairs is that they represent the ratio of two magnitudes. From this point of view the particular numbers chosen to represent the ratio are not important.

For example, ·5/·8, 5000/8000, 25/40, 5√2/8√2, 300/480 all represent the same ratio and the same rational number, and the algorithm given at 3 above will yield the same binary sequence regardless of which of these pairs is used. In some practical affairs the difference in magnitude is of the essence, for example the instruction 'mix 25 bags of cement with 40 bags of sand' cannot be replaced with 'mix 5 bags of cement with 8 bags of sand'. The ratio of sand to cement is maintained, however, the total volume of the mix is not.

However, in some applications of the present invention only the ratio is relevant. The decimal notation for the number pairs above is the same for all, that is ·625. The result of dividing 5 by 8, 5000 by 8000, 25 by 40, 5√2 by 8√2, 300 by 480 is exactly ·625 in each case.

Thus, the algorithm given in 3 above can be modified at Step 1 so that we set X=1, and set Y=the decimal representation of a/b. Thereafter the algorithm proceeds as given above.

C. A pair of relatively prime integers can always be recovered from the decimal form of a rational number even when repeating digits are involved. For example, let x = ·37 recurring = ·373737... Multiplying both sides by 100 yields 100x = 37·3737... = 37+x, that is 100x=37+x, from which follows 99x=37, thus x=37/99. The decimal notation for rational numbers always has a terminating decimal part or a sequence of repeating digits. This fact is used with new theorems in the mathematical paper annexed hereto to show that non-rational decimal numbers (eg. √2, π) uniquely represent distinct infinite binary sequences.
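The recovery of a relatively prime pair from a recurring decimal can be sketched as follows. This is an illustrative helper, not part of the original disclosure; the name `repeating_decimal_to_fraction` and its two string arguments (the non-recurring digits and the recurring block after the point) are assumptions.

```python
from fractions import Fraction

def repeating_decimal_to_fraction(prefix, cycle):
    """Return the fraction equal to 0.<prefix><cycle><cycle>... in scale 10.
    Either string may be empty; Fraction reduces to lowest terms."""
    p, c = len(prefix), len(cycle)
    head = Fraction(int(prefix) if prefix else 0, 10 ** p)
    if not cycle:
        return head
    # A block of c recurring digits equals the block divided by 10^c - 1,
    # shifted right by the p non-recurring places.
    tail = Fraction(int(cycle), (10 ** c - 1) * 10 ** p)
    return head + tail

print(repeating_decimal_to_fraction("", "37"))    # 37/99
print(repeating_decimal_to_fraction("625", ""))   # 5/8
```

The same argument as in the text (multiply by a power of the scale and subtract) underlies the `10 ** c - 1` denominator.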

This property can be exploited as it is important in distinguishing various combinations of periodic waves (continuous periodic functions) for example the sinusoidal waves that characterise radio and electronic communication systems.

The ratio of parameters defining a sinusoidal wave can be exploited to transmit digitised information by the processes of this invention. The ratio of frequency and amplitude or phase and frequency or any of the other combinations can be set to transmit a rational ratio corresponding to the result of encoding a data sequence.

Propagation of waves with these characteristics allows the transmission of a ratio which on reception provide the information sufficient for the recovery of the encoded data sequence by the algorithms described above. It is also possible to combine several frequencies with the required ratio into a single complex wave form which can, by conventional methods, be later decomposed into the component frequencies from which the ratio is immediately available.

The pair X,Y that result from a two number encoding can be used as parameters in the equations describing these waveforms Y=αsin(fω+ν) where α is the parameter by which amplitude is determined and f is the parameter by which frequency is determined.

If two waves are generated with the form a_{1}sin(Xω+ν) and a_{2}sin(Yω+ν) then the wave generated by their sum, namely

a_{1}sin(Xω+ν)+a_{2}sin(Yω+ν)

is periodic with period 2π. If X and Y are not relative primes then the resultant wave will also be periodic with period 2π/n where n is the highest common factor of X and Y.

This property allows information to be embedded within electromagnetic radiation and propagated with that radiation, or to be embedded in other forms of sinusoidal electric signals.

The foregoing describes only some embodiments of the present invention, further disclosure of which is contained in the mathematical paper forming the Appendix hereto.

From these mathematical results it follows that:

1. Any physical system with a parameter which can take rational values, can be used to represent digital data whether as a storage medium in which the data is recorded or as a transmission medium in which the data is transferred from one place to another. If two parameters are employed, the representation is unaffected by uniform decay of their values. In this context uniform decay is independent of magnitude, for example the decay of charge on two identical capacitors.

2. Any physical system which has the properties of 1 above and can

(a) be induced to discriminate between two external conditions, or

(b) have two parameters that can be induced to vary in response to this discrimination as implied in the given algorithms, or any isomorphic operation thereof,

is capable of encoding and decoding data.

APPENDIX

Mathematical Paper Entitled:

THE NUMERICAL REPRESENTATION OF FINITELY GENERATED

SEMIGROUPS

to be offered for publication to the

BRITISH JOURNAL OF MATHEMATICS

or

THE JOURNAL OF THE ASSOCIATION FOR COMPUTING MACHINERY

PREFACE

The results reported here have arisen from the continued consideration of matters treated in the author's doctoral thesis (Masters, 1971). In this work non standard representations of certain infinite subsets of finitely generated semigroups were considered, one aim being to find physical representations suitable for empirical neuropsychological research. In this process ubiquitous physical considerations, symmetries and relative magnitudes for example, play a role which pure mathematical research has no motivation to consider.

The parallel investigation of transformations and invariants required for the case of patterned physical stimuli, and number theoretic considerations, in the case of absolute and relative changes in stimuli presented sequentially, has yielded a series of results on the algebraic, numerical and physical representation of phenomena for which semigroup theory is an appropriate model. Since this includes digitised data as one case in point, results have immediate practical and commercial significance.

At present digitised information is represented for storage, processing and transmission by a large unbounded number of variables which take a very small range of values (usually 0 and 1).

This paper describes a representation which inverts this situation: digitised information is represented by a small, fixed number of variables which can take a large range of values.

From a formal point of view the existing representation constitutes a subsystem of the new. In practice the new representation has a number of distinct advantages, and many processes difficult in the old can be simply and elegantly achieved in the new.

Finitely generated free semigroups are the mathematical counterparts of digitalised information. If the digitalisation recognises two values the appropriate semigroup is the free semigroup over two generators, and in general a digitisation that recognises n values has associated a free semigroup over n generators.

In a manner that is precisely parallel, mathematical theorems recognise that the free semigroup over two symbols can be used to model those over any infinite number of generators, as engineers exploit binary representation to model all forms of digitised data.

The mathematical theory of semigroups has developed as an analytic tool in the study of data processing systems and the formal (programming ) languages designed for them.

In many branches of modern algebra a numerical representation of the structures used for the analysis of engineering systems has been achieved. When achieved, numerical representations have themselves often been directly applied by engineers.

This has not been the case to date with semigroup theory .

This paper studies procedures for establishing numerical representations of finitely generated free semigroups derived from them. The representations discovered have a deep relationship with several powerful branches of mathematics, including number theory and linear algebra. So intimate is the relationship between digital data and these numerical representations that their practical consequences can be immediately grasped and indeed offer the basis for alternate technologies for the information industries and sciences.

INTRODUCTION

For semigroups S and T any morphism from S to T is a representation of S in T. A morphism from S to the full transformation semigroup on S of all self maps under composition exists for all semigroups S. If S can be defined from some generating set X a representation can be found in F_{x}, the free semigroup on X comprising all finite words over X under concatenation.

It is here proposed to seek the representation of the non commutative properties of concatenation of uninterpreted symbols with objects derived from the commutative number systems. The strategy is to exploit the fact that the semigroup of endomorphisms of the direct product of commutative semigroups under composition is not in general commutative.

Preliminary results and the definitions and notation employed throughout are developed in Section I. Examples of the processes proposed are developed in Sections II and III and extended to finitely presented semigroups in Section IV. Section V is less formal and provides an example of application to encryption.

SECTION I.

Throughout · is an associative operation and o is reserved for composition. The semigroup (S,·) is denoted by S when the product is understood. Products a·b are abbreviated to ab.

Function composition is written left to right to be consistent with other products so αβ means first α then β.

As associativity allows products of any length to be written without the brackets, notation based on Π is used for extended products. If S has further structure by virtue of closure under another binary operation additive notation is employed where ambiguity is reduced and where convenient or usual. A semigroup with an identity element is a monoid. A semigroup S, if not a monoid, having adjoined a new symbol, 1, with the property that 1a=a1=a for all a in S∪{1} is the monoid denoted by S^{1}. If S is a monoid S^{1}=S. Similarly S^{0} denotes the semigroup with an adjoined zero element 0, if S does not possess a zero; otherwise S^{0}=S. By convention, S must have at least two elements to possess a zero.

N is used for the natural numbers {1,2...} and ω for N∪{0}={0,1,2...}.

A homomorphism α: S→T is a mapping from a semigroup S into a semigroup T such that (ab)α=aαbα for all a,b in S.

An anti-homomorphism α from a semigroup S to another semigroup T is a mapping which reverses multiplication in that (ab)α=bαaα (a,bεS).

If X is a non-empty set, denote by F_{x} or F(X) the set of all non-empty finite words x_{1}x_{2}...x_{n} in the alphabet X under the operation of concatenation. The semigroup F_{x}, and an embedding of X into F_{x} where each xεX is the corresponding one-letter
word x in F_{x}, form the free semigroup on X. Any two free semigroups on base sets of equal order are isomorphic. The finitely generated free semigroup on a finite set of n elements, F_{n} is taken here to be particularised as the set of symbols 0,1,...,n-1 under finite concatenation with the embedding of symbols 0,1,..., n-1 into F_{n} as the words of unit length. For convenience the symbols will also be treated as the integers 0,1,..., n-1 without specifying the mapping by which this is achieved. Where it is required to distinguish the alphabet {0,1,...,n-1} from the semigroup F_{n}, L_{n} is used, so that F_{n}=F(L_{n}) in a common notation.

For a word x of a free semigroup written with exponents let spine of x or sp(x) denote the word derived from x by setting exponents greater than 1 to 1, and the exponents of x or ex(x) denote the m-tuple, m=|sp(x)|, with the i^{th} component equal to the i^{th} exponent in x. For example sp(ab^{3}a^{2}c)=abac, ex(ab^{3}a^{2}c)=1,3,2,1, sp(abac)=abac, ex(abac)=1,1,1,1.
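The spine and exponents of a word can be computed by grouping runs of equal letters. A minimal sketch, assuming words are given as plain strings with exponents written out (so ab^{3}a^{2}c is "abbbaac"); the function names `spine` and `exponents` are illustrative.

```python
from itertools import groupby

def spine(word):
    """sp(x): collapse every run of a repeated letter to a single letter."""
    return "".join(letter for letter, _ in groupby(word))

def exponents(word):
    """ex(x): the length of each run, in order of occurrence."""
    return tuple(len(list(run)) for _, run in groupby(word))

print(spine("abbbaac"), exponents("abbbaac"))   # abac (1, 3, 2, 1)
print(spine("abac"), exponents("abac"))         # abac (1, 1, 1, 1)
```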

If X is a set, a multiset chosen from X is a function m from X into N∪{0}. For example {a,a,b,b,b,d} denotes the multiset with X={a,b,c,d,e} and m(a)=2, m(b)=3, m(c)=0, m(d)=1, m(e)=0.

Words in F_{n} can be viewed as walks in the complete graph K_{n} augmented so that each vertex has a single loop. Setting the vertex set to be L_{n} allows the edges to represent adjacent symbols in a free word, that is the edges are a subset of L_{n}xL_{n} derived by taking (a,b) to be the same edge as (b,a), giving (n^{2}+n)/2 edges. If loops are omitted only the spine of a word can be represented.

As K_{n} has no parallel edges a walk can be specified by a vertex sequence, eg. (1,3)(3,3)(3,0)(0,2) is an unambiguous
representation of the word 13302.

If edge sequences are used to represent free words then the appropriate graph is the digraph derived from K_{n} by replacing each edge with a pair of parallel arcs of opposite orientation .

SECTION II.

Given a standard embedding of X into F_{x}, if X is a generating set for a semigroup S then for any map Φ:X→S and the map ψ:F_{x}→S defined by (x_{1}x_{2}...x_{n})ψ=x_{1}Φx_{2}Φ...x_{n}Φ then S is isomorphic to F_{x}/ρ where ρ is the relation ψoψ^{-1}. If X is finite and ρ can be generated by a finite set R={(w_{1},z_{1}),...,(w_{m},z_{m})} of elements (w_{i},z_{i})εF_{x}xF_{x}, F_{x}/ρ is finitely presented and

F_{x}/ρ=(x_{1},x_{2}...x_{n}|w_{1}=z_{1}...w_{m}=z_{m}), that is F_{x}/ρ has generators x_{1},x_{2}...x_{n} and relations w_{1}=z_{1}...w_{m}=z_{m}. A semigroup presentation with generating set X and relation set R is denoted by (X;R), and the corresponding semigroup by (X;R). Adjoining the empty string A to F_{x} provides F_{x}^{1}, the free monoid on X. Monoid presentations include relations of the form w=1.

If X is a countable set and x,yεF(X) then S is said to satisfy x=y if xΦ=yΦ for every homomorphism Φ:F(X)→S. Elements of X are regarded as variables and the words x and y are to yield equal members of S for every possible assignment of values of S to the variables. For example a free inverse semigroup satisfies either of the following systems of identities

(a) x=xx^{-1}x, (x^{-1})^{-1}=x, x^{-1}xy^{-1}y=y^{-1}yx^{-1}x;

(b) x=xx^{-1}x, (x^{-1})^{-1}=x, (xy)^{-1}=y^{-1} x^{-1} , xx^{-1} x^{-1}x=x^{-1} xxx^{-1} .

Free inverse semigroups are known not to be finitely presented. (Schein 1963).

Here we consider semigroups with finite generating sets with elements from the set of endomorphisms of some background structure. Unless indicated otherwise, for a set of n endomorphisms {β_{0},...,β_{n-1}} the embedding into F_{n} with β_{i}→i is assumed.
When functions are defined on sets for which they are automorphisms the existence of unique inverses allows for the representation of finitely presented semigroups. Relations can be treated as the (optional) insertion of the inverse of one side of an identity followed by the equivalent composition with the other.

For example in the semigroup (a,b,c,d;ab=c) the word acbab could be treated as ac(c^{-1}ab)bab(b^{-1}a^{-1})c=aabbc.

The representation of relations is developed in Section IV. Considering the systems of identities for free inverse semigroups it can be seen that if words are inverse free no identity applies. The following characterisation of the free inverse semigroup on a set X, denoted by FI_{x} due to Higgins (1992) explicitly partitions elements of X from their inverses as follows.

Let X be a non-empty set and let x→x' be a bijection of X to a disjoint set X'. Consider the free semigroup F_{y} on Y=X∪X'. Define on F_{y} a unary operation (•^{-1}) whereby

y^{-1} = x' if y=x ε X,

y^{-1} = x if y=x' ε X',

and

(x_{1}x_{2}...x_{n})'=x_{n}'x_{n-1}'...x_{1}', x_{j} ε Y.

Now let ρ be the congruence on F_{y} generated by

{(yy'y, y) | y ε F_{y}} ∪ {(yy'zz', zz'yy') | y,z ε F_{y}}.

FI_{x}=(F_{y}/ρ,i) where i = ρ| |X.

Given a set X the construction of FI_{x} provides F_{x} and F_{x'}, and an involution a on their union. All that is wanted here is that for words x,y in F_{x} or F_{x'}, |xy|=|x|+|y|, which does not hold in general in FI_{x}. The groups and semigroups required for the construction of number systems can be developed within this
context. As it turns out there are numerical representations for free semigroups in sets of numerical objects equipped with a single operation. For this reason endomorphisms of semigroups rather than rings are emphasised. Recourse to structures with several operations only occurs when non linear morphs are considered.

Let {β_{1},...,β_{n}} be a set of endomorphisms on a set X.

An enumeration of freely written products of F({β_{1},...,β_{n}}) ordered by length for fixed n

1 β_{1},β_{2},...,p_{i,1}=β_{i},...,β_{n}

2 β_{1}β_{1},β_{1}β_{2},...,β_{1}β_{i},...,β_{1}β_{n},β_{2}β_{1},...,p_{i,2},...,β_{n}β_{n}

.

.

can be compared with one in which each term is replaced by the function defined by composition of functions with the order in which the products are taken strictly preserving the order in that term, that is elements of ({β_{1},...,β_{n}},o). Using p_{i,k} for the i^{th} free product of length k and b_{i,k} for the corresponding function, the corresponding enumeration takes the form

b_{1,1},b_{2,1},...,b_{i,1},...,b_{n,1}

b_{1,2},b_{2,2},...,b_{i,2},...,b_{n^{2},2}

.

.

b_{1,k},b_{2,k},...,b_{i,k},...,b_{l,k}, l=n^{k}

and note for the purpose of induction arguments that each term at length k gives rise to n terms at length k+1 hence the notation b_{(i,j),k} for the function that results from the composition b_{i,k}β_{j}.
If the functions β_{1}...β_{n} are distinct, that is β_{i}=β_{j} if and only if i=j, then an enumeration of freely written products is an enumeration of F_{n}, the free semigroup over n symbols. If the resulting functions are also distinct, that is b_{i,k}=b_{j,l} if and only if i=j and k=l, the bijection necessarily implied is here termed a labelling of F_{n}. If the elements of X can be treated as variables over objects over some number system then the term numerical labelling will be used. If there exists a factorisation algorithm for each label b_{i,k} which enables the recovery of the free product p_{i,k} the labelling is termed an encoding and the factorisation is a decoding, and a numerical encoding (decoding) if the labelling is numerical.

If b_{i,k}ob_{j,l}=b_{n(i-1)+j,k+ 1} for all i,j,k,l>1 then the encoding is a (numerical) representation.

For a free inverse semigroup FI_{x} let S denote either F_{x} or F_{x}'. S can be rendered commutative by the addition of the identity ab=ba.

Let (A_{n},o) be the semigroup generated by the set of n functions {α_{1},...,α_{n}} ⊂ End(S^{n}), n>1, of which the i^{th} function is defined by

(a_{1},a_{2}...a_{i}...a_{n})α_{i} =(a'_{1},a'_{2}...a'_{i}...a'_{n})

a'_{j}=a_{j}a_{i},i≠j

a'_{i}=a_{i}.

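Let (B_{n},o) be the semigroup generated by the set of n functions {β_{1},...,β_{n}} ⊂ End(S^{n}), n>1, of which the i^{th} function is defined by

(a_{1},a_{2}...a_{i}...a_{n})β_{i}=(a'_{1},a'_{2}...a'_{i}...a'_{n})

a'_{j}=a_{j}, j≠i

a'_{i}=a_{1}a_{2}...a_{n}.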

Proposition 2.1 When S is commutative the semigroups A_{n} and B_{n}, n>1, are noncommutative and the sets {α_{1},...,α_{n}}, {β_{1},...,β_{n}} are anticommutative.

Proof: If S is noncommutative there is little to prove.

Assume S is commutative. For arbitrary integers i,j, 1≤i,j≤n, i≠j, compare the products α_{i}α_{j} and α_{j}α_{i}.

(a_{1},a_{2},...,a_{i},...,a_{j},...,a_{n})α_{i}α_{j}

= (a_{1}a_{i},a_{2}a_{i},...,a_{i},...,a_{j}a_{i},...,a_{n}a_{i})α_{j}

= a_{1}a_{i} ^{2}a_{j},a_{2}a_{i} ^{2}a_{j},...,a_{i} ^{2}a_{j},...,a_{j}a_{i},...,a_{n}a_{i} ^{2}a_{j}

(a_{1},a_{2},...,a_{i},...,a_{j},...,a_{n})α_{j}α_{i}

= (a_{1}a_{j},a_{2}a_{j},...,a_{i}a_{j},...,a_{j},...,a_{n}a_{j})α_{i}

= a_{1}a_{i}a_{j} ^{2},a_{2}a_{i}a_{j} ^{2},...,a_{i}a_{j},...,a_{i}a_{j} ^{2},...,a_{n}a_{i}a_{j} ^{2}

so that α_{i}α_{j}≠α_{j}α_{i}.

A similar calculation gives

β_{i}β_{j}≠β_{j}β_{i}. ■

Let (C_{n},o) be the semigroup generated by the set of functions γ_{1},γ_{2}...γ_{i}...γ_{n}, n>1, of which the i^{th} is defined by

(a_{1},a_{2}...a_{i}...a_{n})γ_{i}=(a'_{1},a'_{2}...a'_{i}...a'_{n})

a'_{j}=a_{j} for j≠i

a'_{i}=a_{i}b where b is an a_{k}, k≠i, such that

g(b)=sup({g(a_{l})}\{g(a_{i})}), 1≤l≤n.

Proposition 2.2 The semigroup C_{n} generated by {γ_{1},...,γ_{n}}, n>1, is strictly noncommutative.

Proof: Since monogenic semigroups are commutative, calculation of product of compositions takes the form

(a_{1},...,a_{i},...,a_{j},...,a_{n})γ_{i}γ_{j}=(b_{1},...,b_{n})

= (a_{1},...,a_{i}s,...,a_{j},...,a_{n})γ_{j}

with s any a_{k}, k≠i, for which g(a_{k})=sup({g(a_{l})}\{g(a_{i})}), 1≤l≤n,

= a_{1},...,a_{i}s,...,a_{j}a_{i}s,...,a_{n}

with g(b_{j}) strictly greater than g(b_{k}), k≠j.

Similarly (a_{1},...,a_{n})γ_{j}γ_{i}=(b_{1},...,b_{n})

with g(b_{i}) strictly greater than g for any other entry, hence γ_{i}γ_{j}≠γ_{j}γ_{i}, i≠j. ■

A function g:F_{n}→N with g(xy)=g(x)+g(y) can be defined for the free products by setting g(x)=|x|, the length of the word x. Extension of g to the objects b_{i,j} which result from performing functional composition can be justified when the generators β_{1}...β_{n} form a set of irreducible elements under that operation, in which case the length λ(a) is defined by λ(a)=k if a=p_{1}p_{2}...p_{i}...p_{k}, p_{i} in {β_{1},...β_{n}}. If a unit (identity function) β_{0} is adjoined set λ(β_{0})=0. Under these conditions λ(a) is well defined and interchangeable with |a|. The examples of the next section can be readily seen to fulfil these conditions.

NUMERICAL REPRESENTATIONS.

Consider the homomorphism σ: (F_{x})^{n}→(N,+)^{n} defined by x_{i}→|x_{i}| for each component x_{i} of xε(F_{x})^{n}. For example t=α_{1}α_{3}α_{2}α_{2}α_{4}εA_{5}

a=a_{1},a_{2},a_{3},a_{4},a_{5}

a α_{1}=a_{1},a_{1}a_{2},a_{1}a_{3},a_{1}a_{4},a_{1}a_{5}

aα_{1}α_{3}=a_{1} ^{2} a_{3}, a_{1} ^{2} a_{2}a_{3},a_{1}a_{3},a_{1} ^{2}a_{3}a_{4},a_{1} ^{2} a_{3}a_{5}

aα_{1}α_{3}α_{2}=a_{1} ^{4}a_{2}a_{3} ^{2}, a_{1} ^{2}a_{2}a_{3},a_{1} ^{3}a_{2}a_{3} ^{2} , a_{1} ^{4}a_{2}a_{3} ^{2}a_{4},a_{1} ^{4}a_{2}a_{3} ^{2}a_{5}

aα_{1}α_{3}α_{2}α_{2}=a_{1} ^{6}a_{2} ^{2}a_{3} ^{3}, a_{1} ^{2}a_{2}a_{3}, a_{1} ^{5}a_{2} ^{2}a_{3} ^{3}, a_{1} ^{6}a_{2} ^{2}a_{3} ^{3}a_{4}, a_{1} ^{6}a_{2} ^{2}a_{3} ^{3}a_{5}

aα_{1}α_{3}α_{2}α_{2}α_{4}=a_{1} ^{12}a_{2} ^{4}a_{3} ^{6}a_{4}, a_{1} ^{8}a_{2} ^{3}a_{3} ^{4}a_{4}, a_{1} ^{11}a_{2} ^{4}a_{3} ^{6}a_{4}, a_{1} ^{6}a_{2} ^{2}a_{3} ^{3}a_{4}, a_{1} ^{12}a_{2} ^{4}a_{3} ^{6}a_{4}a_{5}

= at

atσ=23,16,22,12,24 when |a_{i}|=1.

The natural numbers under addition is F_{1}, and commutative by virtue of being monogenic. The n-tuples over N that arise from this map can involve repeated values so that the concept of a multiset is useful.

The following properties of multisets of nonzero positive real numbers, although too simple to require proof, are relied on hereafter. In this context 'supremum' and 'infimum' will be used to nominate a unique element of a set in which repeated values may occur; this usage is an analogy to the infinite case. Sup S, if it exists, is an element strictly greater than any other element of S; inf S, if it exists, is strictly less than any other. For example P={1,1,2,2} has neither a sup(P) nor inf(P) although minP=1 and maxP=2, and Q={1,2,2,3} has supQ=3, infQ=1, maxQ=3, minQ=1.
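The special usage of 'supremum' and 'infimum' for multisets can be made concrete. A minimal sketch, assuming multisets are given as Python lists; the names `sup` and `inf` mirror the text, and `None` stands for "does not exist".

```python
def sup(multiset):
    """The element strictly greater than every other element, or None."""
    top = max(multiset)
    return top if list(multiset).count(top) == 1 else None

def inf(multiset):
    """The element strictly less than every other element, or None."""
    low = min(multiset)
    return low if list(multiset).count(low) == 1 else None

P = [1, 1, 2, 2]
Q = [1, 2, 2, 3]
print(sup(P), inf(P), max(P), min(P))   # None None 2 1
print(sup(Q), inf(Q), max(Q), min(Q))   # 3 1 3 1
```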

Let P and Q be multisets of nonzero positive real numbers.

PQ1) If P' is constructed from P by selecting any element x and adding it to every element of P other than itself then x is the inf of P'.

PQ2) If P' is constructed from P by selecting an element x of P and forming x' from x by adding to it every element of P other than itself then x' is the supremum of P'.

PQ3) If P' is constructed from P by selecting any element x and forming x' by adding to x any element equal to the maximum of P then x' is the supremum of P'.

PQ4) If x is any element of P equal to the maximum of P and y any element of Q then if a set Q' is constructed from Q by replacing the element y with x+y then maxQ'>maxP.

PQ5) If maxP≥maxQ then if x is any element of P with x=maxP and y is any element of Q then if Q' is constructed from Q by replacing the element y with x+y then that element is the supremum of Q', that is supQ'=x+y.

Let S be a semigroup and t be an automorphism on S. The pair <S,t> will be said to have the PQ property if there exists a function g:S→R^{+} such that for all a,b in S g(ab)>g(a) and g(ab)>g(b).

For aεN^{n} denote by {a} the multiset from the components of a and {a}-a_{i} for the multiset from the components of a other than the i^{th}.

The transformation aα_{i} leaves a_{i}'σ<a_{j}'σ, j≠i. The transformation aβ_{i} results in a_{i}'σ>a_{j}'σ, j≠i. Given that aσα_{i} has been performed it follows that i can be determined simply by determining which component of a' has the least value. Similarly given that aσβ_{i} has been performed i can be determined by locating the component of a' with the greatest value. This mechanism can be exploited to define tests for divisibility.
Let τεA_{n} and a=1_{n}τσ then τ is of the form τ'α_{i} if and only if a_{i} = inf {a}.

If τεB_{n} and a=1_{n}τσ then τ is of the form τ'β_{i} if and only if a_{i}=sup{a}.

These tests give an algorithm for recovery of the sequences of generators of A_{n} that have acted on a.

Example a=1 1 1 1 1

Performing aα_{1}α_{3}α_{2}α_{2}α_{4}

aα_{1} =1,2,2,2,2

aα_{1}α_{3}=3,4,2,4,4

aα_{1}α_{3}α_{2}=7,4,6,8,8

aα_{1}α_{3}α_{2}α_{2}=11,4,10,12,12

aα_{1}α_{3}α_{2}α_{2}α_{4}=23,16,22,12,24
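The example above, and the recovery of the generator sequence by the infimum test, can be sketched as follows. This is an illustration under stated assumptions: components are the integer lengths under σ (so products become sums), the starting tuple is all ones, and the names `alpha`, `encode_A` and `decode_A` are hypothetical.

```python
def alpha(a, i):
    """Apply alpha_i (1-based) additively: add a_i to every other component."""
    return [v if k == i - 1 else v + a[i - 1] for k, v in enumerate(a)]

def encode_A(word, n):
    """Apply alpha_i for each index i in word, starting from (1, ..., 1)."""
    a = [1] * n
    for i in word:
        a = alpha(a, i)
    return a

def decode_A(a):
    """Recover the generator indices, in order of application."""
    a, word = list(a), []
    while any(v != 1 for v in a):
        least = min(a)
        if least <= 0 or a.count(least) != 1:
            raise ValueError("no unique least component")
        i = a.index(least)                 # infimum test: last generator applied
        word.append(i + 1)
        a = [v if k == i else v - least for k, v in enumerate(a)]
    return word[::-1]

print(encode_A([1, 3, 2, 2, 4], 5))        # [23, 16, 22, 12, 24]
print(decode_A([23, 16, 22, 12, 24]))      # [1, 3, 2, 2, 4]
```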

Example a=1,1,1,1,1

τ=β_{1}β_{3}β_{2}β_{2}β_{4}εB_{5}

Performing aβ_{1}β_{3}β_{2}β_{2}β_{4}

aβ_{1}=5,1,1,1,1

aβ_{1}β_{3}=5,1,9,1,1

aβ_{1}β_{3}β_{2}=5, 17,9,1,1

aβ_{1}β_{3}β_{2}β_{2}=5,33,9,1,1

aβ_{1}β_{3}β_{2}β_{2}β_{4} = 5,33,9,49,1
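The B_{n} example can be sketched in the same additive form, with the supremum test used for recovery. The reading of β_{i} assumed here (the i^{th} component is replaced by the sum of all components, matching PQ2 and the figures above) and the names `beta`, `encode_B` and `decode_B` are assumptions for illustration.

```python
def beta(a, i):
    """Apply beta_i (1-based) additively: a_i becomes the sum of all components."""
    b = list(a)
    b[i - 1] = sum(a)
    return b

def encode_B(word, n):
    """Apply beta_i for each index i in word, starting from (1, ..., 1)."""
    a = [1] * n
    for i in word:
        a = beta(a, i)
    return a

def decode_B(a):
    """Recover the generator indices, in order of application."""
    a, word = list(a), []
    while any(v != 1 for v in a):
        top = max(a)
        rest = sum(a) - top
        if a.count(top) != 1 or top <= rest:
            raise ValueError("no unique greatest component to divide by")
        i = a.index(top)                   # supremum test: last generator applied
        word.append(i + 1)
        a[i] = top - rest                  # undo beta_i
    return word[::-1]

print(encode_B([1, 3, 2, 2, 4], 5))        # [5, 33, 9, 49, 1]
print(decode_B([5, 33, 9, 49, 1]))         # [1, 3, 2, 2, 4]
```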

The existence of unambiguous tests for divisibility of α_{i} and β_{i} is sufficient to give the result.

Theorem: For all n>1, A_{n} is isomorphic to F_{n} and B_{n} is isomorphic to F_{n}.

Other semigroups of automorphisms will be shown to be isomorphic to F_{n} however it is not sufficient to isolate sets
of generators that are anticommutative as the following example shows.

For a=a,b,c and {θ_{1},θ_{2},θ_{3}}, an anticommutative set of automorphisms defined by

aθ_{1}=a,ba,c

aθ_{2}=(a,b,c)θ_{2}=a,b,cb

aθ_{3}=(a,b,c)θ_{3}=ac,b,c.

Calculating aθ_{1}θ_{2}θ_{2}θ_{1} yields

aθ_{1}=a,ba,c

aθ_{1}θ_{2}=a,ba,cba

aθ_{1}θ_{2}θ_{2}=a,ba,cbaba

aθ_{1}θ_{2}θ_{2}θ_{1}=a,baa,cbaba

Calculating aθ_{2}θ_{1}θ_{1}θ_{2} yields

aθ_{2}=a,b,cb

aθ_{2}θ_{1}=a,ba,cb

aθ_{2}θ_{1}θ_{1}=a,baa,cb

aθ_{2}θ_{1}θ_{1}θ_{2}=a,baa,cbbaa

so that aθ_{1}θ_{2}θ_{2}θ_{1}≠aθ_{2}θ_{1}θ_{1}θ_{2}, however

when {a,b,c} commutes

aθ_{1}θ_{2}θ_{2}θ_{1}=aθ_{2}θ_{1}θ_{1}θ_{2}=a,a^{2}b,a^{2}b^{2}c.
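The counterexample can be replayed with strings standing for words in a,b,c. A minimal sketch; the names `theta` and `act` are illustrative, and letter counts (via `Counter`) stand in for the commutative case.

```python
from collections import Counter

def theta(t, i):
    """Apply theta_i to a triple of words, following the definitions above."""
    a, b, c = t
    if i == 1:
        return (a, b + a, c)     # (a, b, c) -> (a, ba, c)
    if i == 2:
        return (a, b, c + b)     # (a, b, c) -> (a, b, cb)
    if i == 3:
        return (a + c, b, c)     # (a, b, c) -> (ac, b, c)
    raise ValueError("index must be 1, 2 or 3")

def act(word, t=("a", "b", "c")):
    """Apply the generators named in word, left to right."""
    for i in word:
        t = theta(t, i)
    return t

left = act([1, 2, 2, 1])     # ('a', 'baa', 'cbaba')
right = act([2, 1, 1, 2])    # ('a', 'baa', 'cbbaa')
print(left != right)                                          # True
print([Counter(w) for w in left] == [Counter(w) for w in right])  # True
```

The two products differ as words but agree once the letters are allowed to commute, which is why anticommutativity of the generators alone does not give a free semigroup.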

Define functions γ_{i}, 1≤i≤n, by

a_{j}'=a_{j}, j≠i

a_{i}'=a_{i}+max({a}-a_{i}).

Let G_{n} be the semigroup generated by {γ_{1},...γ_{n}}.

It is central to what follows to observe that after performing aγ_{i} the component a_{i}' is the sup {a'}.
Example a=1,1,1,1,1

Performing aγ_{1}γ_{3}γ_{2}γ_{2}γ_{4}

max({a}-a_{1})=1

aγ_{1}=2,1,1,1,1 sup {aγ_{1}}=a_{1}'

max({aγ_{1}}-a_{3})=2

aγ_{1}γ_{3}=2,1,3,1,1 sup {aγ_{1}γ_{3}}=a_{3}'

max({aγ_{1}γ_{3}}-a_{2})=3

aγ_{1}γ_{3}γ_{2}=2,4,3,1,1 sup {aγ_{1}γ_{3}γ_{2}}=a_{2}'

max({aγ_{1}γ_{3}γ_{2}}-a_{2})=3

aγ_{1}γ_{3}γ_{2}γ_{2}=2,7,3,1,1 sup {aγ_{1}γ_{3}γ_{2}γ_{2}}=a_{2}'

max({aγ_{1}γ_{3}γ_{2}γ_{2}}-a_{4})=7

aγ_{1}γ_{3}γ_{2}γ_{2}γ_{4}=2,7,3,8,1 sup {aγ_{1}γ_{3}γ_{2}γ_{2}γ_{4}}=a_{4}'.

Set b=aγ_{1}γ_{3}γ_{2}γ_{2}γ_{4} = 2,7,3,8,1.

Given that b=aτ, τεG_{n} then τ can be recovered from b because determination of i for which b_{i}=sup{b} is unambiguous and is a test for (post) divisibility by γ_{i}, and γ_{i}^{-1} is well defined by

(b_{1},...,b_{n})γ_{i}^{-1}=b', b_{i}'=b_{i}-max({b}-b_{i}).

Given as an algorithm, recovery of the sequences of generators of G_{n} that have acted on a becomes

Step 1. Accept b

Step 2. If b_{i} is the sup{b} emit i

Step 3. Set b_{i}=b_{i}-max({b}-b_{i})

Step 4. If b≠a go to Step 2 else end.
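The recovery steps can be sketched as follows. This is a minimal illustration, assuming the starting tuple a is all ones and reading the subtraction in Step 3 as "subtract the greatest of the other components"; the name `decode_G` is hypothetical.

```python
def decode_G(b):
    """Return the indices i of the generators gamma_i in the order emitted
    by Steps 2 to 4, i.e. in reverse order to their application."""
    b, emitted = list(b), []              # Step 1
    while any(v != 1 for v in b):         # Step 4: stop when b equals a = (1, ..., 1)
        top = max(b)
        if b.count(top) != 1:
            raise ValueError("sup{b} does not exist")
        i = b.index(top)                  # Step 2: locate the supremum
        emitted.append(i + 1)
        others = [v for k, v in enumerate(b) if k != i]
        b[i] = top - max(others)          # Step 3: undo gamma_i
    return emitted

print(decode_G([2, 7, 3, 8, 1]))          # [4, 2, 2, 3, 1]
```

Reversing the emitted list gives 1,3,2,2,4, the order in which the generators were applied in the example.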

The output sequence of integers clearly index the functions γ_{i} in reverse order to their application. However assume an embedding of the generators {γ_{1},...γ_{n}} of G_{n} into F_{n} having particularised F_{n} as sequences on the alphabet {0,1... n-1}.
The algorithm given, modified at Step 2 to read "emit i-1" now emits sequences in F_{n}.

Again with this embedding in mind the procedure for performing aτ can be given in terms of elements of F_{n}, as an algorithm.

Step 1. Accept x=x_{1}...x_{|x|}εF_{n}

Step 2. Set a=1_{n}, i=1

Step 3. Set a_{j}=a_{j}+max({a}-a_{j})

where j=x_{i}+1

set i=i+1

Step 4. If i>|x| then end, else go to Step 3.
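The procedure for performing aτ from a word over the alphabet {0,...,n-1} can be sketched as follows. The function name `encode_G` is an assumption, and the termination test is read as "stop once every symbol of x has been used".

```python
def encode_G(x, n):
    """Apply gamma_{x_i + 1} for each symbol x_i of the word x, starting
    from a = (1, ..., 1) with n components."""
    a = [1] * n                                   # Step 2
    for symbol in x:                              # Steps 3 and 4
        j = symbol                                # 0-based position of component x_i + 1
        others = [v for k, v in enumerate(a) if k != j]
        a[j] = a[j] + max(others)                 # add the greatest other component
    return a

# The word 0,2,1,1,3 corresponds to gamma_1 gamma_3 gamma_2 gamma_2 gamma_4.
print(encode_G([0, 2, 1, 1, 3], 5))               # [2, 7, 3, 8, 1]
```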

The unique factorisation algorithm for τ in G_{n} is sufficient to justify the following.

Theorem 2.1 For n>1 G_{n} is isomorphic to F_{n}.

Several systems of automorphism have now been shown to be isomorphic to finitely generated free semigroups. These are now developed into encoding and decoding procedures by defining encoding and decoding functions.

Hereafter the n automorphisms generating a semigroup are indexed 0,...,n-1 and Γ_{n}={θ_{0},...,θ_{n-1}} is generic for the several semigroups so generated and shown to be isomorphic to F_{n}.

An encoding function is a function e:(N^{n},i)→N^{n} defined by (a,i)→aθ_{i} with a in N^{n} and where i is a symbol from the particularisation of F_{n} as F({0,...,i,...,n-1}) with e extended to F_{n} by

e(a,x)=e(...(e(e(a,x_{1}),x_{2})...)x_{n})

x=x_{1}...x_{n}εF_{n},x_{i} in {0, ...,n-1}.

An anti-encoding for x=x_{1}...x_{n} can be defined by

e(a,x)=e(...e(e(a,x_{n})x_{n-1}...)x_{1}).
Let E(a)⊂N^{n} be defined by y∈E(a)→y=e(a,x) for some word x in F_{n}. A decoding function d:E(a)→{0,...,n-1} is defined by d(b)→bθ_{i}^{-1}, i where b∈E(a) and i is selected by a divisibility test defined in Γ_{n}, and is extended by repeated application until b'=a.

An encoding function is fully specified when n, Γ_{n},

Φ: {θ_{0},...,θ_{n-1}}→{0,...,n-1} and a are fixed. Hereafter a standard encoding over Γ_{n} will be the encoding function with a=1_{n}, Φ defined by θ_{i}→i.

Denote by R_{n}, n>1 the set of relatively prime n-tuples in (N^{0})^{n}. When n=2, R_{2} can be identified as Q^{+}, the positive rational numbers.

Assigning a highest common factor to pairs of integers defines an associative, commutative binary operation under which the positive integers are closed. In the resultant semigroup the integer 1 acts as a zero as (1,x)=1 for all x in Z so that 1x=x1=1. By setting (0,x)=x, 0 acts as the identity since the implied product is 0x=x0=x for all x in Z. Denote by H the monoid with zero of N^{0} under this operation. The sets R_{n} above can be defined as the set of free words w of length n in H with w=1. If addition is allowed in H the operations interact by the identity a(a+b)=ab reflecting that (a,b)=(a,a+b). As a consequence the standard encodings of F_{n} from 1_{n} over A_{n}, B_{n} and C_{n} are surjections into R_{n}.

The homomorphism σ: (F_{x})^{n}→(N, +)^{n} with x_{j}→|x| will be termed 'standard'.
The sets of automorphisms A_{n} and B_{n} are readily seen to be linear transforms; their matrix representation is treated in the next section. The transforms of C_{n} must be redefined to make linearity apparent in a manner suitable for matrix representation.

Let D_{n}⊂Aut(N^{n}) be the set of n^{2}-n functions δ_{ij}, 1≤i, j≤n, i≠j of which the i,j^{th} is given by

(a_{1},...,a_{i},...,a_{j},...,a_{n})δ_{ij}=(a_{1},...,a_{i},...,a_{j}',...,a_{n})

where a_{j}'=a_{i}+a_{j}. Functions δ_{i,j} and δ_{k,l} commute if and only if j≠k and i≠l.

Informally D_{n} will be used to represent the transitions between symbols in free words and vertices of graphs.

Let T_{n} be the semigroup on {(i,j) | 1≤i, j≤n}∪{0} with a product defined by (i,j)(k,l)=0 if j≠k else (i,l).

As 0 acts as a zero in T_{n} any word x=x_{1}...x_{n}, x_{i}=(j_{i},k_{i}) in which x_{i}x_{i+1}=(j_{i},k_{i})(j_{i+1},k_{i+1}) with k_{i}≠j_{i+1} occurs is identically 0.

Denote by N(T_{n}) the set of nonzero words of F(T_{n}). Clearly N(T_{n}) is not closed under the product of T_{n}. However there is a bijection between N(T_{n}) and F_{n}, constructed as follows.

1) Given x∈F_{n}, x=x_{1}...x_{n}, form the word x'=x_{0}sp(x) where x_{0} is a symbol in the alphabet for F_{n}, x_{0}≠x_{1}. A suitable rule for selecting x_{0} given the alphabet 0,...,n-1 is:

set x_{0}=(x_{1}+1) mod n. For example, for x=0^{3}1^{2}2^{1}0^{2}, x' is 10120.

2) Construct the sequence of pairs p=p_{1},...,p_{m}, p_{i}=(x_{i}',x_{i+1}'), m=|x|. For the example p=(1,0)(0,1)(1,2)(2,0).

3) Construct the word y in D_{n}

y=y_{1}^{e_{1}}...y_{m}^{e_{m}} with y_{i}=δ_{p_{i}}, e_{i} the i^{th} component of exp(x). For the example 0^{3}1^{2}2^{1}0^{2}→δ_{1,0}^{3}δ_{0,1}^{2}δ_{1,2}δ_{2,0}^{2}.
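The three construction steps can be sketched in a few lines. The sketch assumes that sp(x) lists one symbol per run of x, that exp(x) lists the run lengths, and that x_{0} is chosen as (x_{1}+1) mod n; the function names are hypothetical.

```python
from itertools import groupby


def to_delta_word(x, n):
    """Map a non-empty word over {0,...,n-1} to a list of ((i, j), exponent) terms."""
    runs = [(symbol, len(list(group))) for symbol, group in groupby(x)]
    support = [symbol for symbol, _ in runs]    # sp(x): one symbol per run
    exponents = [length for _, length in runs]  # exp(x): the run lengths
    x0 = (support[0] + 1) % n                   # assumed rule: a symbol different from x_1
    primed = [x0] + support                     # x' = x_0 sp(x)
    pairs = zip(primed, primed[1:])             # transitions between successive symbols
    return list(zip(pairs, exponents))


def from_delta_word(word):
    """Invert the map: each term delta_{i,j}^e contributes e copies of j."""
    out = []
    for (_, j), exponent in word:
        out.extend([j] * exponent)
    return out


print(to_delta_word([0, 0, 0, 1, 1, 2, 0, 0], 3))
```

Each term delta_{i,j}^e stands for a run of e copies of j that follows the symbol i, which is why the inverse map only needs the second index and the exponent.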
Define the mapping τ_{n}: [n]x[n]→E_{ij} of the integer pairs (i,j), 0≤i, j≤n-1 into the elementary matrices of order n by taking τ(i,j) to be the n-square matrix with the entries of the major diagonal set to 1 and the i,j^{th} entry set to 1. If i=j the map is to the identity matrix of order n. The range of τ is the elementary matrices E_{n} of order n which act on an arbitrary nxm matrix so that E_{ij}X, i≠j, is derived from X by adding the j^{th} row to the i^{th} row. Products in E_{n}, E_{ij}E_{kl}, commute if and only if j≠k and i≠l. By mapping δ_{ij}→E_{ij} a matrix representation that provides a numerical encoding equivalent to C_{n} is achieved.
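The row action and the commutation behaviour of these matrices can be checked directly. The sketch below uses plain nested lists with 0-based indices; the helper names are illustrative and not taken from the text. An exhaustive check over small orders finds that E_{ij} and E_{kl} (i≠j, k≠l) commute exactly when both j≠k and i≠l hold.

```python
from itertools import product


def elementary(i, j, n):
    """Identity of order n with the (i, j) entry also set to 1 (0-based indices)."""
    m = [[int(r == c) for c in range(n)] for r in range(n)]
    m[i][j] = 1
    return m


def matmul(x, y):
    """Plain matrix product of nested lists."""
    return [[sum(x[r][t] * y[t][c] for t in range(len(y)))
             for c in range(len(y[0]))] for r in range(len(x))]


def commutation_rule_holds(n):
    """True if E_ij and E_kl commute exactly when j != k and i != l, for all pairs."""
    pairs = [(i, j) for i, j in product(range(n), repeat=2) if i != j]
    for (i, j), (k, l) in product(pairs, repeat=2):
        a, b = elementary(i, j, n), elementary(k, l, n)
        if (matmul(a, b) == matmul(b, a)) != (j != k and i != l):
            return False
    return True


# E_01 applied on the left adds row 1 of X to row 0 of X.
print(matmul(elementary(0, 1, 2), [[1, 2], [3, 4]]))
print(all(commutation_rule_holds(n) for n in (2, 3, 4)))
```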

The resulting expression for the free product can be interpreted as representing the transition, left to right, between successive symbols of x.

In the encoding and decoding procedures described the components of a act as variables. The initialisation for encoding could be reworded "set variables a_{1},...,a_{n} to 1.". So viewed, the encodings treated above involve n variables for F_{n}. By disengaging the number of variables from the order of the alphabet it is possible to associate encoding of F_{n} with walks on a digraph of m vertices, m≥n.

To generate a numerical representation of a particular finite state transducer assume that

1. a function T:QxS→Q of states by (input) symbols to states from which is derived

2. a function QxS→O of states by input symbols to output symbols is given either explicitly (by tables) or implicitly by some rule (eg. T(q_{c},s_{i})=q_{i}).

Take Q to be a set of q states q_{1},...,q_{q} one of which is a designated start state q_{0} and S to contain s symbols s_{1},...,s_{s}.

Set Q to be a set of q variables q_{1},...,q_{q} initially set to some nonzero value; d is a variable not in Q initialised to the initial value of the variable corresponding to the start state.

On accepting the first input symbol set the variable corresponding to T(q_{0},x_{1}) to T(q_{0},x_{1})d. For each input thereafter set T(q_{c},x_{i})=T(q_{c},x_{i})d.

The value of d is changed when T(q_{c},x_{i})≠q_{c}, q_{c} being the current state, and is set to the value of the variable corresponding to q_{c}.

SECTION III.

Denote by M_{n} the semigroup comprising all n-square matrices of determinant 1 over N^{0}, excluding the identity matrix of order n, under matrix multiplication. When the identity element I_{n} is to be included the resulting monoid is denoted by M_{n} ^{1}.

Define the mapping τ_{n}: [n]x[n]→E_{ij} of the integer pairs

(i,j), 1≤i,j≤n into the elementary matrices of order n by taking τ(i,j) to be the n-square matrix with entries of the major diagonal set to 1 and the i,j^{th} entry set to 1. If i=j the map is to the identity matrix of order n.

Denote by E_{n}^{1} the monoid generated by the n^{2}-n+1 n-square matrices of determinant 1 determined by τ_{n}, and by E_{n} the semigroup obtained when the identity is excluded. Enumeration for E_{n} takes the form


Denote by A_{n} the semigroup generated by the n matrices, A_{i,n} under matrix multiplication. Where

Enumeration for A_{n} takes the form

Denote by B

Enumeration for B_{n} takes the form

Denote by C_{n} the semigroup generated by the n matrices C_{i,n} under matrix multiplication, where

Enumeration for C_{n} takes the form

For the case n=2, E_{2}=A_{2}=B_{2}=C_{2}. Hence the binary case is distinguished, and the semigroup generated by {α,β} under matrix multiplication is hereafter denoted by B (without subscript).

Theorem: 3.2 If Q^{+} is the set of ordered pairs of relatively prime positive integers and M_{2} is the 2-square matrices of determinant 1 over N^{0} then the function q:Q^{+}→M_{2} defined below is a bijection.

For (x,y) in Q^{+} with x,y>1

where [n]_{m}^{-1} is the unique integer k in the representative class 0 to m-1 for which nk≡1 mod m. For x=1 or y=1.

Proof:

Recalling [α]_{β}^{-1} exists → (α,β)=1

and [α]_{β}^{-1}+[-α]_{β}^{-1}=β.

Injection:-

→ [y]_{x}^{-1}+[-y]_{x}^{-1}=x=a+b

and [v]_{u}^{-1}+[-v]_{u}^{-1}=u=a+b

→ x=u.

Similarly

[-x]_{y} ^{-1}+[x]_{y} ^{-1}=y=c+d

and [-u]_{v} ^{-1}+[u]_{v} ^{-1}=v=c+d

→ y=v,

Surjection:-

If the matrix with rows (a,b) and (c,d) is in M_{2} then ad-bc=1

→ (a,b)=1

→ (a+b,b)=1 and (a,a+b)=1

→ [a]_{a+b}^{-1} and [b]_{a+b}^{-1} exist.

b≡-a mod a+b

→ [b]_{a+b}^{-1}=[-a]_{a+b}^{-1}

[a]_{a+b}^{-1}+[-a]_{a+b}^{-1}=a+b

Similarly

[-d]_{c+d}^{-1}+[d]_{c+d}^{-1}=c+d

Hence the matrix equals q(a+b,c+d).

Again from ad-bc=1

(a+b)(c+d)-bc=a(c+d)+bd and (a+b)(c+d)-bc=c(a+b)+bd+1

→ a(c+d)-c(a+b)=1

→ a+b,c+d ∈ Q^{+}.

This proof establishes that r:M_{2}→Q^{+}, r=q^{-1}

Example 3.2

Theorem 3.3: M

that is M_{2}=E_{2},

Proof: If M_{2} is generated by {α,β} then for all A in M_{2}

A≠I_{2} and either Aα^{-1}∈M_{2} or Aβ^{-1}∈M_{2},

that is either

Case 1. a=b or c=d.

a=b

ad-bc=1 → (a,b) = 1 → a=b=1

and A is in the form

Similarly

c=d → A is in the form

and Aβ^{-1}ε M_{2}.

Case 2. Neither a=b nor c=d.

ad-bc=1 → a/(c+1/b)=b/d

b ε N→ 1/b<1 hence

a>b→ c>d and

a<b → c<d

→ either Aα^{-1} or Aβ^{-1}ε M_{2} ■

This result guarantees that matrices in M_{2} can be rewritten as a finite sequence of the generators α and β. Moreover the divisibility test for (pre)division by generators has been shown to be exclusive, that is, division is always possible by either α or β, not both. These are the requirements for unique factorisation into sequences of the two generators up to and including order. A factorisation algorithm takes the form: if the sum of the top row is less than the sum of the bottom row, the matrix is divisible by α, else it is divisible by β. Division is performed as follows:

if divisible by α subtract the top row from the bottom, for β subtract the bottom row from the top. This process can be continued until the row sums are equal, which can only occur for the identity.
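A minimal sketch of this row-subtraction factorisation follows. Because the defining matrices of α and β are not reproduced in this text, the sketch names its generators L (adds the top row to the bottom row when applied on the left) and U (adds the bottom row to the top row); identifying L with α and U with β is an assumption. At each step the row with the smaller sum is subtracted from the row with the larger sum.

```python
from functools import reduce

L = ((1, 0), (1, 1))  # left factor that adds the top row to the bottom row
U = ((1, 1), (0, 1))  # left factor that adds the bottom row to the top row
GENERATORS = {"L": L, "U": U}
IDENTITY = ((1, 0), (0, 1))


def mul(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def factorise(m):
    """Return the unique word g_1 g_2 ... g_k over {L, U} whose product is m."""
    (a, b), (c, d) = m
    if min(a, b, c, d) < 0 or a * d - b * c != 1:
        raise ValueError("expected a non-negative integer matrix of determinant 1")
    word = []
    while a + b != c + d:          # equal row sums occur only at the identity
        if a + b > c + d:
            a, b = a - c, b - d    # divide out U on the left
            word.append("U")
        else:
            c, d = c - a, d - b    # divide out L on the left
            word.append("L")
    return word


def product(word):
    """Multiply the generators named in `word`, left to right."""
    return reduce(mul, (GENERATORS[g] for g in word), IDENTITY)


print(factorise(((3, 5), (1, 2))))
```

Since each step removes the leftmost factor, the emitted word reads in the same order as the product.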

The set of all finite sequences over {α,β} is just another name for F_{2} and associativity guarantees that matrix multiplication in M_{2} can be translated as concatenation in F_{2}. These observations are collected as

Theorem 3.4

The bijection of M_{2} onto Q^{+} can be used to induce a non commutative operation on positive real numbers in the obvious way. The circled multiplication ⊕ will be reserved for this operation defined by

(a,b)⊕(c,d)=q(r(a,b),r(c,d))

which can be computed directly in Q^{+} by

where b*=[b]_{a} ^{-1}, a*=[a]_{b} ^{-1}, D=c-d.

consequently

Theorem 3.5 M_{2} (Q^{+},⊕)

F

Tables 3.1, 3.2 illustrate the relationship between rational numbers, matrices and generators.

Table 3.1 The mapping q of relative prime pairs a,b to square matrices over N^{0} of determinant 1 for a,b≤10.

Table 3.2 The products of matrices α and β for each entry in Table 3.1.

Definition: For x an arbitrary n-square matrix denote by x^{G} the matrix kx^{T}k where k is the n-square matrix with each entry in the minor diagonal set to 1 and 0 elsewhere.

For n=2, k is the matrix with rows (0,1) and (1,0); k has the properties k^{2}=I and k^{T}=k.

For x,y arbitrary square matrices of equal order

(xy)^{G}=k(xy)^{T}k

=ky^{T}x^{T}k

=ky^{T}kkx^{T}k

=y^{G}x^{G}.

In the present case α^{G}=α and β^{G}=β hence (αβ)^{G}=βα and (βα)^{G}=αβ, consequently this operation on an arbitrary finite product of n terms x_{1}x_{2}...x_{i}...x_{n}, x_{i}=α or x_{i}=β can be seen to provide the result of the product taken in the reverse order

x_{n}x_{n-1}...x_{i}...x_{1} as follows

(x_{1}...x_{n})^{G}=((x_{1}...x_{n-1})(x_{n}))^{G}

=x_{n}(x_{1}...x_{n-1})^{G} =x_{n}x_{n-1}(x_{1}... x_{n-2})^{G}

..

..

=x_{n}x_{n-1}...x_{1}.

For example it can be checked by calculation or from the tables that α^{2}βα=

Equivalently 0010→ the anti encoding of

0010. From this relation it readily follows that

r((q(a,b))^{G}) = [a]_{a+b} ^{-1}, [b]_{a+b} ^{-1}. So for example the binary

sequence associated with the rational number 8/3 is the binary sequence, in reverse order, associated with the rational number 7/4. The term arithmetic inverse of a rational number is proposed for this operation which can be defined in Q^{+} by [a/b]^{-1}=[a]_{a+b} ^{-1}/[b]_{a+b} ^{-1}.
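The arithmetic inverse is straightforward to compute with modular inverses. The sketch below assumes Python 3.8 or later, where the built-in `pow(a, -1, m)` returns the inverse of a modulo m; the function name is illustrative.

```python
from math import gcd


def arithmetic_inverse(a, b):
    """Return ([a]^-1, [b]^-1) with both inverses taken modulo a+b."""
    if a < 1 or b < 1 or gcd(a, b) != 1:
        raise ValueError("expected a pair of relatively prime positive integers")
    m = a + b
    return pow(a, -1, m), pow(b, -1, m)


print(arithmetic_inverse(8, 3))  # the pair for 8/3 maps to the pair for 7/4
print(arithmetic_inverse(7, 4))  # applying the operation again returns 8/3
```

Applied twice, the operation returns the original pair, in keeping with its description as a reversal of the associated binary sequence.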
When a matrix X in M_{2} has the property X^{G}=X the associated binary sequence must be a palindrome and in the rationals

[a/b]^{-1}=a/b will hold. When [a/b]^{-1}=b/a then the associated binary sequence will have the property that taking its complement yields the sequence in reverse order. Transposing

X∈M_{2} gives the matrix for the complement in reverse order.

Examples of these and other relationships necessarily implied can be found in the tables above.

The progress of an encoding can be conveniently displayed by diagrams such as

which displays the encoding e((1,1), 01101) with the

transforms similarly displayed as

(a,b) -0→ (a,a+b)

(a,b) -1→ (a+b,b)

Which are arbitrarily selected for the standard encoding.
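The standard encoding and its decoding can be sketched directly from the two transforms. The function names are illustrative; decoding starts from the final pair, so the bits are recovered last-first and reversed at the end.

```python
from math import gcd


def encode(bits, start=(1, 1)):
    """Apply (a,b)->(a,a+b) for '0' and (a,b)->(a+b,b) for '1', left to right."""
    a, b = start
    for bit in bits:
        if bit == "0":
            b = a + b
        else:
            a = a + b
    return a, b


def decode(a, b):
    """Invert the standard encoding from (1,1) by repeated subtraction."""
    if a < 1 or b < 1 or gcd(a, b) != 1:
        raise ValueError("expected a pair of relatively prime positive integers")
    out = []
    while (a, b) != (1, 1):
        if b > a:              # the last transform applied was (a,b)->(a,a+b)
            out.append("0")
            b -= a
        else:                  # the last transform applied was (a,b)->(a+b,b)
            out.append("1")
            a -= b
    return "".join(reversed(out))


print(encode("01101"))   # e((1,1), 01101)
print(decode(12, 7))
```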

The behaviour of some 'non standard' encodings is illustrated this way below.

Does not discriminate between initial bits. However encoding from 2/1 is without ambiguity. Decoding is achieved with the test 'if a/b>2 emit 0 else emit 1'. The symmetrical pair,

(a,b) -0→ (a,a+b)

(a,b) -1→ (b,a+b)

have, mutatis mutandis, the same properties. The two pairs can be combined in an algorithm which yields the arithmetic inverse of the standard encoding as follows:

If the first bit is zero encode using the pair

(a,b) -0→ (a,a+b)

(a,b) -1→ (b,a+b), encode from 1,2.

If the first bit is one encode using the pair

(a,b) -0→ (a+b,a)

(a,b) -1→ (a+b,b), encode from 2,1.

Take the first derivative of the input sequence as the sequence to be encoded.

Example 01101 has the first derivative 1011.

The first bit of the original sequence is zero so the encoding proceeds

Compare with the standard encoding

The following algorithm outputs a pair a,b which decodes in the input order, without explicitly developing a 2x2 matrix.

Step 1. Set s=0 if first symbol is "0"

else s="1"

Step 2. Initialise six variables

a=1, b=2, u=0, v=1, x=1, y=1

Step 3. If next input = last input then a=a+u

b=b+v

Else

Set u=x, v=y, x=a, y=b.

Set a=a+u

Set b=b+v

Step 4. If last input then

if a=1 swap the values of a and b.

Emit a and b.

If not last input accept next

Goto Step 3.

The function q can be used to explicate consideration of the decomposition of rationals into partial fractions. The relationship with decomposition into distinct Egyptian (unit) fractions is immediate, since in order that such a process terminates the final step is a subtraction of the form a/c - b/d constrained by the requirement that ad-bc=1.

This procedure generalises to enable encoding sequences over k different symbols as follows:

Step 1. Assign to each of the k symbols a distinct integer in the range 0 to k-1. For example over 7 symbols 1,0,6,5,3,1,2 would be a possible sequence.

Assign to the sequence a pair of integers

b=k^{n}

where

a and b divided by (a,b).

This expression is equivalent to forming the sums of the form c_{1}/k^{1} + c_{2}/k^{2} + ... + c_{n}/k^{n}.

For example the sequence 1,2,0,5,3 with k=6 of length n=5 is assigned the partial fraction 1/6 + 2/6^{2} + 0/6^{3} + 5/6^{4} + 3/6^{5}.
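The assignment of a reduced fraction to a sequence, and a greedy digit-by-digit recovery of the kind the decoding steps for this procedure describe, can be sketched with exact rational arithmetic. Two assumptions are made here: the contribution m/k^{n} of each emitted digit is subtracted from the running value before the next digit is sought, and the sequence length is supplied to the decoder. The function names are illustrative.

```python
from fractions import Fraction


def encode(sequence, k):
    """Return c_1/k + c_2/k^2 + ... + c_n/k^n as a reduced fraction."""
    return sum((Fraction(c, k ** i) for i, c in enumerate(sequence, start=1)),
               Fraction(0))


def decode(value, k, length):
    """Recover `length` digits, taking at each place the largest m with m/k^n <= value."""
    digits = []
    for n in range(1, length + 1):
        unit = Fraction(1, k ** n)
        m = value // unit          # floor division of Fractions yields an int
        digits.append(m)
        value -= m * unit          # assumed step: remove the digit just emitted
    return digits


q = encode([1, 2, 0, 5, 3], 6)
print(q, decode(q, 6, 5))
```

`Fraction` reduces automatically, which corresponds to dividing a and b by their highest common factor.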

The rational number is "decoded" as follows

Step 1. Accept a,b

Set n=1

Step 2. Set m=k-1

Step 3. If m/k^{n} is less than or equal to the rational number a/b then

Emit m

Set n=n+1

Goto Step 2

Else if m>1

Set m=m-1, repeat Step 3

Else if m=1

Emit 0

Set n=n+1

If n=string length then end

Else goto Step 2.

SECTION IV.

In this section the relationship between finite presentation and numerical representation is examined. Existing techniques for representing operations on free words include the graph theoretic semigroup diagrams defined as follows:

Definition. A semigroup diagram is a planar connected digraph Γ embedded in R^{2}, labelled by a function which assigns to each oriented edge of Γ an element of F_{x}.

(i) Each component of S^{2}-Γ=(R^{2}∪∞)-Γ is two-sided, meaning that it is bounded by a clockwise cycle of the form αβ^{-1}, where α and β are non-empty paths (that is, paths which traverse edges only in the direction of their orientation; β^{-1} denotes the reverse of β).

(ii) Γ has exactly one source, 0, and one sink I, each of which lies on the unbounded component of S^{2}-Γ.

If M is a diagram with underlying graph Γ, then the bounded components of S^{2}-Γ are the regions of M. The number of regions of M is denoted by ||M||. If D is a region of M with clockwise boundary cycle αβ^{-1} as in (i) above, then α[β] is called the left [right] boundary of D. The labelling function Φ extends to a function of non-empty directed paths and takes values in the free semigroup F_{x}. The left and right boundary labels of α and β are then r=Φ(α) and s=Φ(β) respectively, that is δD, the boundary of D, has label rs^{-1}.

A similar terminology applies to the whole diagram M. If D is the unbounded region of S^{2}-Γ, and αβ^{-1} is a clockwise boundary cycle of D as in (i) above, then βα^{-1} is a clockwise boundary cycle of S^{2}-intD. Refer to β[α] as the left [right] boundary of M. Also, Φ(β)[Φ(α)] is the left [right] boundary label of M. The source 0 of a semigroup diagram is unique, and is the unique transmitter of the diagram; dually, the unique sink is its unique receiver.

Let S be a semigroup with presentation (X;R) where, without loss, assume that R is symmetric. A diagram M over (X;R) is one in which each interior region has left boundary label r and right boundary label s, where {r,s}∈R; M is a (u,v)-diagram for the pair (u,v), where u and v are the left and right boundary labels of M respectively. M is also termed a derivation diagram over (X;R) for the pair (u,v) and extended as the (u,v)-diagram for any sequence of elementary R-transitions u≡u_{0}→u_{1}→...→u_{n}≡v (u,v∈F_{x}). Here u≡v is used if u and v represent the same member of F_{x}, and u=v is written if u and v are equal in S.

When functions are defined on sets for which they are automorphisms the existence of unique inverses allows for the representation of finitely presented semigroups.

Example 4.1 For the semigroup (a,b,c,d;ab=c) the derivation diagram for acbab=a^{2}b^{2}c by acbab=a^{2}b^{2}ab=a^{2}b^{2}c can be displayed as below

Relations can be treated as the (optional) insertion of the inverse of one side of an identity followed by the equivalent composition with the other. The derivation exploiting inverses becomes ac(c^{-1}ab)bab(b^{-1}a^{-1}c)=aabbc.

The left [right] graph LG(X;R) [RG(X;R)] associated with the presentation (X;R) has vertex set X and edge set R. A defining relation {r,s} joins the initial [terminal] letters of r and s. Loops and parallel edges may occur in either graph.

Example: 4.1 The left and right graphs of the presentation S=<a,b,c,d; abd=dba, adb=cad, aba=ad^{2}, bab=c, cab=acb> are respectively:

A walk of length m≥0 in an undirected graph is determined by a sequence of vertices a_{0},a_{1},...,a_{m} and edges e_{1},...,e_{m} such that e_{i} runs between a_{i-1} and a_{i} (i=1,...,m). The walk is closed if a_{0}=a_{m} and is a cycle if it is a closed trail, m≥1, and the vertices a_{1},...,a_{m} are all different. (In particular, any loop is a cycle.) Walks and cycles of LG(X;R) are called left walks and left cycles respectively for (X;R). The pair (X;R) is cycle-free if (X;R) has no left and no right cycles.

Define the mapping τ_{n}: [n]x[n]→E_{ij} of the integer pairs (i,j), 0≤i,j≤n-1 into the elementary matrices of order n by taking τ(i,j) to be the n-square matrix with the entries of the major diagonal set to 1 and the i,j^{th} entry set to 1. If i=j the map is to the identity matrix of order n. The range of τ is the elementary matrices E_{n} of order n which act on an arbitrary nxm matrix so that E_{ij}X, i≠j, is derived from X by adding the j^{th} row to the i^{th} row. Products in E_{n}, E_{ij}E_{kl}, commute if and only if j≠k and i≠l. This fact suggests a strategy for developing sets of generators that are non commutative. Consider the set of n matrices A_{n} in which A_{i,n} is defined by

Products in A_{n} such as A_{i}A_{j} inevitably involve a product E_{ij}E_{jk}, as E_{ij} is a factor of A_{i} and E_{jk} is a factor of A_{j}; and as the factors of A_{i} commute among themselves, as do the factors of A_{j}, A_{i} can be written with E_{ij} to the right and A_{j} can be written with an E_{jk} to the left.

In the product tables below for n=2,3,4 and 5 commuting pairs are indicated by 0 and noncommuting pairs by 1.

SECTION V.

Consider μ: (F_{x})^{n}→M(P^{+}), M (P^{+}) the collection of (finite) multisets over sets of nonzero positive real numbers.

Variables q_{i} corresponding to states can be replaced by a finite collection of variables of any nonzero order provided only that both sup q_{i} and inf q_{i} are well defined. When variables are split in this way subvariables are initialised to some value and the variable d (which is not split) is set to be an element of q_{0} determined by rule or table and the product T(q_{c},x_{i})d is a product of d with an element of T(q_{c},x_{i}) determined by rule or table.

Let ρ be the set of finite sets of real numbers and Δ a binary operation on R. Call a function σ a selection function if σ:P→p with P∈ρ and p∈P. Let σ_{1} and σ_{2} be selection functions.

For P_{i},P_{j} in ρ the product of P_{i}P_{j} is given

P_{i}P_{j}={P_{j}\(P_{j})σ_{2}}∪{(P_{i})σ_{1}Δ(P_{j})σ_{2}}.

Example 3.4 P_{1}={1,1,1,1}, P_{2}={1,1,1} σ_{1} selects any element equal to supP. σ_{2} selects any element equal to inf P.

aΔb = a+b if a≤1, else a+b-1.

A numerical representation that can always be faithfully decoded is as follows.

If 0

replace an infimum of B by the Δ product of a supremum of A and an infimum of B.

If 1

replace an infimum of A by the sum of a supremum of B and an infimum of A. E.g. set A={1,1,1,1}, B={1,1,1}; a numerical representation for

10010 proceeds by :-

({1 1 1 1}, {1 1 1})1

({2 1 1 1}, {1 1 1})0

({2 1 1 1}, {2 1 1})0

({2 1 1 1}, {2 2 1})1

({2 3 1 1}, {2 2 1})0

({2 3 1 1}, {2 2 3})

Decoding proceeds by :-

If supB≥supA

Emit 0

if supA>1 set a supremum of B to supB-supA+1 else to supB-supA

Else

Emit 1

set a supremum of A to supA-supB.

If supA=supB=1 then end.

({2 3 1 1}, {2 2 3})0

({2 3 1 1}, {2 2 1})1

({2 1 1 1}, {2 2 1})0

({2 1 1 1}, {1 2 1})0

({2 1 1 1}, {1 1 1})1

({1 1 1 1}, {1 1 1})

01001^{R}=10010
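The worked trace can be reproduced with a short sketch. It follows the trace and the decoding steps as written: on input 1 an infimum of A is replaced by its sum with a supremum of B, and on input 0 an infimum of B is replaced by the Δ product of a supremum of A with that infimum. Lists stand in for multisets, ties are broken by taking the first occurrence, and the function names are illustrative.

```python
def delta(a, b):
    """The operation a Δ b: a+b if a <= 1, else a+b-1."""
    return a + b if a <= 1 else a + b - 1


def encode(bits, A, B):
    """Encode a bit string into the pair of multisets (A, B)."""
    A, B = list(A), list(B)
    for bit in bits:
        if bit == "1":
            i = A.index(min(A))
            A[i] = max(B) + A[i]
        else:
            i = B.index(min(B))
            B[i] = delta(max(A), B[i])
    return A, B


def decode(A, B):
    """Recover the bit string; assumes every variable was initialised to 1."""
    A, B = list(A), list(B)
    out = []
    while not (max(A) == 1 and max(B) == 1):
        if max(B) >= max(A):
            out.append("0")
            i = B.index(max(B))
            B[i] = B[i] - max(A) + 1 if max(A) > 1 else B[i] - max(A)
        else:
            out.append("1")
            i = A.index(max(A))
            A[i] = A[i] - max(B)
    return "".join(reversed(out))  # bits are recovered last-first


A, B = encode("10010", [1, 1, 1, 1], [1, 1, 1])
print(A, B, decode(A, B))
```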

This approach to a key is made with certain unsolvability results in mind. The aim, of course, being to arrive at a situation where it could be claimed that to assert the existence of a successful attack would be to raise a logical impossibility.

Post's Correspondence Problem involves two equal-length lists of words over a fixed alphabet. The entries in each list have a corresponding order (index) fixed at the outset.

The process studied is that of selecting finite sequences of indices to construct corresponding words from the two lists.

For example the sequence 1,4,4,3 yields aabbbbba from list 1 and aababb from list 2.

Lists of a pair are equivalent if there exists at least one finite sequence of indices such that corresponding words are equal.

The sequence 1,4,2 demonstrates equivalence in this case as it yields aabbab from both lists.

This example can be contrasted with the following instance of

Post's correspondence problem

for which it is easy to prove that no such sequence exists and conclude that the lists are not equivalent.

However in the general case the decision problem for Post's correspondence problem is known to be recursively unsolvable, which is to say that any procedure (algorithm) designed to determine whether lists are equivalent must yield false determinations or fail to terminate. The Word Problem for Semigroups can also be stated as a decision problem about equivalence. In this case we are to examine a pair of sequences of symbols from a finite alphabet and a finite list of rules for altering such sequences. The rules take the general form of specifying subsequences that can be replaced by some other subsequence.

For example the pair (ab, bbb) can be read as "an occurrence of ab can be replaced by bbb and bbb by ab" so aaabbb can be transformed into aabbbbb then into aabbab.

The rules can involve a stronger condition which restricts the replacement of a subsequence to instances where it occurs bounded by other specified sequences.

For example (babba, bbbbba) can be read as "ab can be replaced by bbb if it occurs with b immediately to the left, and ba immediately to the right."

Such rules are sometimes referred to as "context sensitive". Sequences are equivalent if one can be transformed into the other by a finite number of applications of a specified list of rules.

The word problem would have solvable status if there was a procedure which accepted arbitrary pairs of sequences and an arbitrary list of rules and gave as output a correct decision on the question of equivalence between the sequences under those rules. No such procedure can exist.
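The rule (ab, bbb) and the transformation of aaabbb into aabbab can be illustrated with a small rewriting sketch. The function names are illustrative. The search is deliberately bounded by a step count: it can confirm equivalence when a short derivation exists, but a negative answer only means that none was found within the bound, in keeping with the unsolvability of the general problem.

```python
def one_step(word, rules):
    """All words obtained by applying one rule once, in either direction."""
    results = set()
    for left, right in rules:
        for source, target in ((left, right), (right, left)):
            start = word.find(source)
            while start != -1:
                results.add(word[:start] + target + word[start + len(source):])
                start = word.find(source, start + 1)
    return results


def equivalent_within(u, v, rules, max_steps):
    """Breadth-first search for a derivation of v from u of at most max_steps steps."""
    seen = {u}
    frontier = {u}
    for _ in range(max_steps):
        if v in seen:
            return True
        frontier = {w for x in frontier for w in one_step(x, rules)} - seen
        seen |= frontier
    return v in seen


rules = [("ab", "bbb")]
print(equivalent_within("aaabbb", "aabbab", rules, 2))
```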

The concept of a key has here been translated to mean a collection of choices that arise in fully describing particular encoding and decoding algorithms from an infinite category. Nothing is exploited that is not novel to the class of algorithms described. Choice can be made available with respect to the following.

1. The binary operations to be employed and

1.1 parameters of each operation chosen.

2. The number of classes of variables.

3. The number of variables in each class and

3.1 initial values of every variable in each class.

4. The length of the vector controlling the order in which variables within a class are employed. One vector per class and

4.1 the content of each vector.

5. The order in which the components of the resulting representation are to be transmitted.

The example which follows should clarify these terms.

EXAMPLE 5.1

E 1. Assume that in a list of binary operations provided, the following segment occurs

.

.

a o b = xa+yb

a o b = x(a+b)+y

a o b = xab

.

.

and some subset is selected.

E 1.1 The parameters x and y must now be set for this operation to be fully defined.

E 2. Selecting k classes implies that the source alphabet will be treated as a k-ary fixed length block code during encryption. Say k is set at 3 and the source alphabet is one of the DOS code pages. The decimal value for each character will now be treated as a string of ternary digits for the purpose of encryption.

Having set k=3, three classes of variables must be distinguished. Use V_{1}, V_{2}, V_{3} for these classes.

E 3. Using n(V_{i}) for the number of variables in the i^{th} class, set n(V_{1})=12, n(V_{2})=9, n(V_{3})=14. n(V_{i})>0 is the only restriction and the choice for one class is independent of that for any other class.

E 3.1 The initial setting for each variable can be any real number within a range determined by the choice of binary operation. There is no requirement for the initial settings to be equal within or between classes.

E 4. The order in which variables in each class are employed is without restriction and independent of the choice for any other class. While it is possible to generate infinite non-periodic sequences to govern this process it is central to this concept of a key to select a finite length at which a chosen sequence for a class of variables is repeated.

Use O_{i} for a vector that specifies an order for variables in the i^{th} class and L(O_{i}) for the length of that vector.

For example set

L(O_{1}) = 55

L(O_{2}) = 47

L(O_{3}) = 60

and

E 4.1 specify

O_{1} = 1, 3, 7, 12, 7, 9, 9, 3, 2, 3, 2, 11, ....55 terms

O_{2} = 7, 6, 5, 3, 9, 8, 7, 1, ..................................47 terms

O_{3} = 6, 1, 9, 4, 6, 2, 9, 3, 3, 3, 14, ...............60 terms

E 5. The output from the algorithm specified will be a sequence of numerical vectors of length Σn(V_{i}),

in this example 12+9+14 = 35.

Now specify a vector T of length 35 the entries of which determine the particular variable to which the numerical components in each block of 35 must be assigned to achieve decoding.

Example choose T to begin

V_{1,3} ,V_{2,5} ,V_{3,7} ,V_{2,7} ,................................35 terms

where V_{i,j} is the j^{th} variable in the i^{th} class.

The combinatorial problem faced in an attack by exhaustive enumeration can now be calculated. It is assumed the attack is made with the following already available.

a) Full knowledge of the class of formal systems and the detail of algorithms constructed from them

b) Full detail of the process by which keys are constructed

c) Some process involving inspection of transmission that can discover the length of the vector T.

Contributions to the combinatorics arise as follows.

1. Knowing the total number of variables gives no information about k, the number of classes, or the number of variables in each class.

The number of possibilities is

p(Σn(V_{i}))

p(x) being the number of partitions of x.

For the example p(35)=14,883

2. Recalling that O_{i} is a vector that controls the order in which variables in the i^{th} class are employed and L(O_{i}) is its length, there is a contribution of

Π[(n(V_{i})!)(n(V_{i})^{L(O_{i})-n(V_{i})})]

from this source.

For this example the terms in the product are of the magnitude

i = 1 1.2 x 10^{55}

i = 2 6.6 x 10^{41}

i = 3 4.5 x 10^{63}

and their product is of the magnitude 3.7 x 10^{160}.

3. The number of choices in determining the order of components of T is just

(Σn(V_{i}))!

For the example

35! = 1.03 x 10^{40}

In general the contribution from these sources is

p(Σn(V_{i}))((Σn(V_{i}))!)(Π[(n(V_{i})!)(n(V_{i})^{L(O_{i})-n(V_{i})})])

For the example the result is of the order

5.69 x 10^{204}

(For reference 2^{64} = 1.84 x 10^{19})
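The three contributions for Example 5.1 can be evaluated exactly with integer arithmetic. The sketch below assumes that the per-class ordering term is n(V_{i})! times n(V_{i}) raised to the power L(O_{i})-n(V_{i}), a reading that reproduces the per-class magnitudes quoted for the example; the variable names are illustrative, and `math.prod` needs Python 3.8 or later.

```python
from math import factorial, prod


def partitions(n):
    """Number of partitions p(n), by the standard coin-change style recurrence."""
    counts = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            counts[total] += counts[total - part]
    return counts[n]


sizes = [12, 9, 14]      # n(V_1), n(V_2), n(V_3)
lengths = [55, 47, 60]   # L(O_1), L(O_2), L(O_3)
total_variables = sum(sizes)

class_split = partitions(total_variables)            # ways to split the variables into classes
order_terms = [factorial(n) * n ** (length - n)      # assumed per-class ordering term
               for n, length in zip(sizes, lengths)]
transmission_order = factorial(total_variables)      # orderings of the components of T

key_space = class_split * transmission_order * prod(order_terms)

print(class_split)
print([f"{float(term):.1e}" for term in order_terms])
print(f"{float(key_space):.2e}")
```

Evaluated this way, the overall count is a 205-digit integer.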
No attempt has been made to quantify contributions from other freedom in the construction of a key for the reasons that follow.

1. The choice of binary operation is from a list the contents of which may need to be assumed known for the purpose of analysis of attack. Even a list of several thousand would be, by comparison with the contributions above, relatively small on that assumption.

2. The choice in setting the parameters of the binary operation chosen and freedom available from setting initial values for variables is, theoretically at least, a choice of real numbers in some range and consequently, being not enumerable, defies discrete combinatorial analysis.

References.

Goldberg, D. (1991). What every computer scientist should know about floating point arithmetic. ACM Comput. Surv., 23, 1, 5-48.

Higgins, P.M. (1992). Techniques of semigroup theory. Oxford University Press.

Howson, A.G. (1972). A Handbook of terms used in algebra and analysis. Cambridge University Press.

Lallement, G. (1986). Some algorithms for semigroups and monoids presented by a single relation. Semigroup Theory and Applications, Oberwolfach, Lecture Notes in Math., 1320, Springer-Verlag, pp. 176-82.

Masters, J.H. (1971). Psychological Predictions From Recursive Function Theory. Doctoral thesis. University of Sydney.

Motzkin, T. (1949). The Euclidean algorithm. Bulletin of the American Mathematical Society, Vol. 55, pages 1142-1146.

Robinson, D. W. (1962). On the generalised inverse of an arbitrary linear transformation. Amer. Math. Monthly 69, 412-416.

Schein, B. M. (1963). On the theory of generalised groups. Dokl. Akad. Nauk SSSR 153, 296-9 (in Russian). Quoted by Higgins, P.M. 1992 at p15.

Patent Citations

Cited Patent | Filing date | Publication date | Applicant | Title |
---|---|---|---|---|
WO1991019271A1 * | May 16, 1991 | Dec 12, 1991 | Aware, Inc. | Improved method and apparatus for coding an image |
WO1995008873A1 * | Aug 8, 1994 | Mar 30, 1995 | CODEX CORPORATION, a subsidiary company of MOTOROLA, INC. | Data compression method and device utilizing children arrays |

Referenced by

Citing Patent | Filing date | Publication date | Applicant | Title |
---|---|---|---|---|
US5812072 * | Jun 2, 1995 | Sep 22, 1998 | Masters; John | Data conversion technique |
US6493449 * | Feb 26, 1998 | Dec 10, 2002 | Arithmetica, Inc. | Method and apparatus for cryptographically secure algebraic key establishment protocols based on monoids |

Classifications

International Classification | H03M13/00, H03M7/30 |

Cooperative Classification | H03M13/6312, H03M13/00, H03M7/30 |

European Classification | H03M13/63C, H03M7/30, H03M13/00 |

Legal Events

Date | Code | Event | Description |
---|---|---|---|

Dec 14, 1995 | AK | Designated states | Kind code of ref document: A1 Designated state(s): AM AT AU BB BG BR BY CA CH CN CZ DE DK EE ES FI GB GE HU IS JP KE KG KP KR KZ LK LR LT LU LV MD MG MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TT UA UG US UZ VN |

Dec 14, 1995 | AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): KE MW SD SZ UG AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG |

Jan 11, 1996 | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |

Mar 27, 1996 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |

Mar 21, 1997 | WWE | Wipo information: entry into national phase | Ref document number: 08776214 Country of ref document: US |

Mar 27, 1997 | REG | Reference to national code | Ref country code: DE Ref legal event code: 8642 |

Sep 24, 1997 | 122 | Ep: pct application non-entry in european phase | |

Feb 3, 1998 | NENP | Non-entry into the national phase in: | Ref country code: CA |
