US 6832234 B2

Abstract

A method of performing in-place arithmetic, particularly addition and subtraction, on numbers stored in respective consecutive rows of an array processor that has two tags registers. In a first machine cycle per bit, results of logical operations are stored in the tags registers, and the tags registers are shifted to align the intermediate results with other rows. In a second machine cycle per bit, results of further logical operations are stored in the tags registers, and the tags registers are shifted back to align the new intermediate results with the original rows.
Claims (58)

1. Given Q binary numbers a(q), where q is an index between 1 and Q, all of the binary numbers having a common number M of bits indexed by an index m between 1 and M, a method of, for a positive integer P that is less than Q and for all values of q between 1 and P, combining a(q) with a(q+Q−P) to produce M combination bits, comprising the steps of:
(a) providing an array processor that includes an array of content addressable memory (CAM) cells;
(b) storing the binary numbers in respective consecutive rows of said array;
(c) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and P: performing at least one first logical operation having a match signal corresponding to the m-th bit of a(q) as an input thereof, thereby producing a first output, and
(d) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and P: performing at least one second logical operation having, as inputs thereof, a match signal corresponding to the m-th bit of a(q+Q−P) and said first output, thereby producing a second output.
2. The method of
(A) an AND operation having, as one input thereof, said match signal corresponding to the m-th bit of a(q+Q−P); and
(B) an XOR operation having, as inputs thereof, said first output and an output of said AND operation that has, as one input thereof, said match signal corresponding to the m-th bit of a(q+Q−P).
3. The method of
4. The method of
(e) for each value of m between 1 and M: for each value of q between 1 and P: storing said second output in one of said CAM cells as an m-th combination bit of a(q) and a(q+Q−P).
5. The method of
6. The method of
7. The method of
(e) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and P: performing at least one third logical operation having, as inputs thereof, said match signal corresponding to the m-th bit of a(q) and said respective carry bit, thereby providing a third output;
wherein, for each value of m between 1 and M, for each value of q between 1 and P, said at least one second logical operation also has, as an input thereof, said third output; and wherein the method further comprises the step of:
(f) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and P: performing at least one fourth logical operation having, as inputs thereof, said match signal corresponding to the m-th bit of a(q+Q−P) and said third output, thereby providing a fourth output.
8. The method of
9. The method of
(i) an AND operation having, as inputs thereof, said match signal corresponding to the m-th bit of a(q+Q−P) and said first output; and
(ii) an OR operation having, as inputs thereof, said first output and an output of said AND operation that has, as inputs thereof, said match signal corresponding to the m-th bit of a(q+Q−P) and said first output.
10. The method of
(i) said at least one third logical operation includes:
(A) a NOT operation having as an input thereof said match signal corresponding to the m-th bit of a(q), and
(B) an AND operation having, as inputs thereof, said carry bit and an output of said NOT operation that has, as an input thereof, said match signal corresponding to the m-th bit of a(q); and
(ii) said at least one fourth logical operation includes:
(A) a NOT operation having as an input thereof said match signal corresponding to the m-th bit of a(q+Q−P),
(B) an AND operation having, as inputs thereof, said first output and an output of said NOT operation that has, as an input thereof, said match signal corresponding to the m-th bit of a(q+Q−P), and
(C) an OR operation having, as inputs thereof, said third output and an output of said AND operation that has, as inputs thereof, said first output and an output of said NOT operation that has, as an input thereof, said match signal corresponding to the m-th bit of a(q+Q−P).
11. The method of
(A) an AND operation having, as inputs thereof, said match signal corresponding to the m-th bit of a(q+Q−P) and said third output; and
(B) an XOR operation having, as inputs thereof, said first output and an output of said AND operation that has, as inputs thereof, said match signal corresponding to the m-th bit of a(q+Q−P) and said third output.
12. The method of
13. The method of
(f) for each value of m between 1 and M, for each value of q between 1 and P: storing said fourth output in one of said CAM cells as an m-th combination bit of a(q) and a(q+Q−P).
14. The method of
15. The method of
16. The method of
17. The method of
18. The method of
19. The method of
20. The method of
21. The method of
22. The method of
23. The method of
24. The method of
25. The method of
(i) a bit stored in said respective first tag register cell immediately prior to said single machine cycle, and
(ii) if said single machine cycle is a compare cycle: an output of a compare operation on said each row;
wherein, for each value of m between 1 and M, for each value of q between 1 and P, said at least one first logical operation having said match signal corresponding to the m-th bit of a(q) as an input thereof is performed using said respective first logic unit of said row wherein a(q) is stored, the method further comprising the steps of: for each value of m between 1 and M, prior to said performing of said second logical operations:
(e) for each value of q between 1 and P: storing said first output in said respective first tag register cell of said row wherein a(q) is stored; and
(f) shifting said first tags register by Q−P, so that, for each value of q between 1 and P, said first output now is stored in said respective first tag register cell of said row wherein a(q+Q−P) is stored;
and wherein, for each value of m between 1 and M, for each value of q between 1 and P, said at least one second operation, that has, as inputs thereof, said match signal corresponding to the m-th bit of a(q+Q−P) and said first output, is performed using said respective first logic unit of said row wherein a(q+Q−P) is stored.
26. The method of
(g) for each value of q between 1 and P: storing said second output in said respective first tag register cell of said row wherein a(q+Q−P) is stored; and
(h) shifting said first tags register by P−Q, so that, for each value of q between 1 and P, said second output now is stored in said respective first tags register cell of said row wherein a(q) is stored.
27. The method of
(i) replacing the m-th bit of a(q) with said second output.
28. The method of
(i) said bit stored in said respective first tag register cell immediately prior to said single machine cycle,
(ii) a bit stored in said respective second tag register cell immediately prior to said single machine cycle, and
(iii) if said single machine cycle is a compare cycle: said output of said compare operation on said each row;
wherein, for each row of said array, said respective first logical unit also is operative to perform, within a single machine cycle, at least one logical operation on input including at least two operands selected from the group consisting of:
(i) said bit stored in said respective first tag register cell immediately prior to said single machine cycle,
(ii) said bit stored in said respective second tag register cell immediately prior to said single machine cycle, and
(iii) if said single machine cycle is a compare cycle: said output of said compare operation on said each row;
wherein, for each value of m between 1 and M, for each value of q between 1 and P, said at least one first logical operation also has a respective carry bit as an input thereof, wherein the method further comprises the step of:
(g) for each value of m between 1 and M:
(i) for each value of q between 1 and P:
(A) performing at least one third logical operation having as inputs thereof said match signal corresponding to the m-th bit of a(q) and said respective carry bit, using said respective second logic unit of said row wherein a(q) is stored, thereby providing a third output, and
(B) storing said third output in said respective second tag register cell of said row wherein a(q) is stored;
(ii) shifting said second tags register by Q−P, so that, for each value of q between 1 and P, said third output now is stored in said respective second tag register cell of said row wherein a(q+Q−P) is stored; and
(iii) performing at least one fourth logical operation having as inputs thereof said match signal corresponding to the m-th bit of a(q+Q−P) and said third output, using said respective second logic unit of said row wherein a(q+Q−P) is stored, thereby providing a fourth output.
29. The method of
(h) for each value of q between 1 and P, storing said fourth output in said respective second tag register cell of said row wherein a(q+Q−P) is stored; and
(i) shifting said second tags register by P−Q, so that, for each value of q between 1 and P, said fourth output now is stored in said respective second tag register cell of said row wherein a(q) is stored.
30. The method of
(j) replacing the m-th bit of a(q) with said fourth output.
31. The method of
32. Given Q binary numbers a(q), where q is an index between 1 and Q, all of the binary numbers having a common number M of bits indexed by an index m between 1 and M, a method of, for a positive integer P that is less than Q and for all values of q between 1 and P, adding a(q) to a(q+Q−P), comprising the steps of:
(a) providing an array processor that includes an array of content addressable memory (CAM) cells;
(b) storing the binary numbers in respective consecutive rows of said array;
(c) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and P:
(i) performing at least one first logical operation having, as inputs thereof, a match signal corresponding to the m-th bit of a(q) and a respective carry bit, thereby producing a first output, and
(ii) performing at least one second logical operation having, as inputs thereof, said match signal corresponding to the m-th bit of a(q) and said respective carry bit, thereby producing a second output; and
(d) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and Q:
(i) performing at least one third logical operation having, as inputs thereof, a match signal corresponding to the m-th bit of a(q+Q−P) and said first output, thereby producing a third output, and
(ii) performing at least one fourth logical operation having, as inputs thereof, said match signal corresponding to the m-th bit of a(q+Q−P) and said second output, thereby providing a fourth output.
33. The method of
(e) storing said third output in one of said CAM cells as a sum bit.
34. The method of
35. The method of
36. The method of
(e) storing said fourth output in one of said CAM cells as a sum bit.
37. The method of
38. The method of
39. The method of
40. The method of
41. The method of
42. The method of
43. The method of
44. The method of
45. The method of
(i) a bit stored in said respective tag register cell of said first tags register immediately prior to said single machine cycle,
(ii) a bit stored in said respective tag register cell of said second tags register immediately prior to said single machine cycle, and
(iii) if said single machine cycle is a compare cycle: an output of a compare operation on said each row;
wherein, for each value of m between 1 and M, for each value of q between 1 and P:
(i) said at least one first logical operation is effected using said respective logic unit, of said first tags register, that corresponds to said row wherein a(q) is stored,
(ii) said first output is stored in said respective tag register cell, of said first tags register, that corresponds to said row wherein a(q) is stored,
(iii) said at least one second logical operation is effected using said respective logic unit, of said second tags register, that corresponds to said row wherein a(q) is stored, and
(iv) said second output is stored in said respective tag register cell, of said second tags register, that corresponds to said row wherein a(q) is stored;
the method further comprising the step of:
(e) for each value of m between 1 and M, subsequent to said first and second logical operations, shifting said first and second tags registers by Q−P, so that, for each value of q between 1 and P, said first output now is stored in said respective tag register cell, of said first tags register, that corresponds to said row wherein a(q+Q−P) is stored, and said second output now is stored in said respective tag register cell, of said second tags register, that corresponds to said row wherein a(q+Q−P) is stored;
and wherein, for each value of m between 1 and M, for each value of q between 1 and P:
(i) said at least one third logical operation is effected using said respective logic unit, of said first tags register, that corresponds to said row wherein a(q+Q−P) is stored,
(ii) said third output is stored in said respective tag register cell, of said first tags register, that corresponds to said row wherein a(q+Q−P) is stored,
(iii) said at least one fourth logical operation is effected using said respective logic unit, of said second tags register, that corresponds to said row wherein a(q+Q−P) is stored, and
(iv) said fourth output is stored in said respective tag register cell, of said second tags register, that corresponds to said row wherein a(q+Q−P) is stored.
46. The method of
(f) for each value of m between 1 and M, subsequent to said third and fourth logical operations, shifting said first and second tags registers by P−Q, so that, for each value of q between 1 and P, said third output now is stored in said respective tag register cell, of said first tags register, that corresponds to said row wherein a(q) is stored, and said fourth output now is stored in said respective tag register cell, of said second tags register, that corresponds to said row wherein a(q) is stored.
47. Given Q binary numbers a(q), where q is an index between 1 and Q, all of the binary numbers having a common number M of bits indexed by an index m between 1 and M, a method of, for a positive integer P that is less than Q and for all values of q between 1 and P, subtracting a(q+Q−P) from a(q), comprising the steps of:
(a) providing an array processor that includes an array of content addressable memory (CAM) cells;
(b) storing the binary numbers in respective consecutive rows of said array;
(c) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and P:
(i) performing at least one first logical operation having, as inputs thereof, a match signal corresponding to the m-th bit of a(q) and a respective carry bit, thereby producing a first output, and
(ii) performing at least one second logical operation having, as inputs thereof, said match signal corresponding to the m-th bit of a(q) and said respective carry bit, thereby producing a second output; and
(d) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and Q:
(i) performing at least one third logical operation having, as inputs thereof, a match signal corresponding to the m-th bit of a(q+Q−P) and said first output, thereby producing a third output, and
(ii) performing at least one fourth logical operation having, as inputs thereof, said match signal corresponding to the m-th bit of a(q+Q−P) and said second output, thereby providing a fourth output.
48. The method of
(d) storing said third output in one of said CAM cells as a difference bit.
49. The method of
50. The method of
51. The method of
52. The method of
53. The method of
54. The method of
55. The method of
56. The method of
57. The method of
(i) a bit stored in said respective tag register cell of said first tags register immediately prior to said single machine cycle,
(ii) a bit stored in said respective tag register cell of said second tags register immediately prior to said single machine cycle, and
(iii) if said single machine cycle is a compare cycle: an output of a compare operation on said each row;
wherein, for each value of m between 1 and M, for each value of q between 1 and P:
(i) said at least one first logical operation is effected using said respective logic unit, of said first tags register, that corresponds to said row wherein a(q) is stored,
(ii) said first output is stored in said respective tag register cell, of said first tags register, that corresponds to said row wherein a(q) is stored,
(iii) said at least one second logical operation is effected using said respective logic unit, of said second tags register, that corresponds to said row wherein a(q) is stored, and
(iv) said second output is stored in said respective tag register cell, of said second tags register, that corresponds to said row wherein a(q) is stored;
the method further comprising the step of:
(e) for each value of m between 1 and M, subsequent to said first and second logical operations, shifting said first and second tags registers by Q−P, so that, for each value of q between 1 and P, said first output now is stored in said respective tag register cell, of said first tags register, that corresponds to said row wherein a(q+Q−P) is stored, and said second output now is stored in said respective tag register cell, of said second tags register, that corresponds to said row wherein a(q+Q−P) is stored;
and wherein, for each value of m between 1 and M, for each value of q between 1 and P:
(i) said at least one third logical operation is effected using said respective logic unit, of said first tags register, that corresponds to said row wherein a(q+Q−P) is stored,
(ii) said third output is stored in said respective tag register cell, of said first tags register, that corresponds to said row wherein a(q+Q−P) is stored,
(iii) said at least one fourth logical operation is effected using said respective logic unit, of said second tags register, that corresponds to said row wherein a(q+Q−P) is stored, and
(iv) said fourth output is stored in said respective tag register cell, of said second tags register, that corresponds to said row wherein a(q+Q−P) is stored.
58. The method of
(f) for each value of m between 1 and M, subsequent to said third and fourth logical operations, shifting said first and second tags registers by P−Q, so that, for each value of q between 1 and P, said third output now is stored in said respective tag register cell, of said first tags register, that corresponds to said row wherein a(q) is stored, and said fourth output now is stored in said respective tag register cell, of said second tags register, that corresponds to said row wherein a(q) is stored.
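The tags register shifts recited in the claims above (a shift by Q−P to align an intermediate result with the row of a(q+Q−P), then a shift by P−Q to bring it back to the row of a(q)) can be pictured with a short Python sketch. This is an illustration only: the function name `shift`, the zero fill of vacated cells, and the example values of Q and P are assumptions, not details taken from the patent.

```python
def shift(tags, k):
    """Shift a tags register by k rows.

    Assumed behavior: k > 0 moves each bit toward higher row numbers,
    k < 0 moves it toward lower row numbers, vacated cells are filled
    with 0, and bits shifted past either end are dropped.
    """
    out = [0] * len(tags)
    for row, bit in enumerate(tags):
        if 0 <= row + k < len(tags):
            out[row + k] = bit
    return out


Q, P = 7, 3                          # hypothetical sizes, P < Q
tags = [1, 0, 1, 0, 0, 0, 0]         # bits held for rows q = 1..P (list index q-1)

aligned = shift(tags, Q - P)         # bit for row q now sits at row q+Q-P
restored = shift(aligned, P - Q)     # shifted back to the original rows

print(aligned)
print(restored)
```

With these values Q−P is 4, so the bits that started in rows 1 through 3 are aligned with rows 5 through 7 after the first shift and return to rows 1 through 3 after the second.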
Description

This is a continuation in part of U.S. patent application Ser. No. 10/108,451, filed Mar. 29, 2002.

The present invention relates to associative processors and, more particularly, to a method of performing arithmetical operations such as addition and subtraction on numbers stored in the associative array of an associative processor.

An associative processor is a device for parallel processing of a large volume of data. FIG. 1 is a schematic illustration of an associative processor Each machine cycle of associative processor In the example illustrated in FIG. 1, the fifth through eighth columns Each logic unit In summary, in both kinds of elementary operations, tags register Tags logic blocks An additional function of tags registers

More information about associative processors may be found in U.S. Pat. No. 5,974,521, to Akerib, which is incorporated by reference for all purposes as if fully set forth herein.

A prior art method of adding a first set of Q binary numbers {a(q), q=1 . . . Q}, stored in a first set of columns http://www.ulib.org/webRoot/Books/Saving_Bell_Books/SBN%20Computer%20Strucutres/csp0336.htm. Without loss of generality, all the input numbers {a(q)} and {b(q)} can be assumed to have the same number of bits, because any number that is shorter than the longest input number can be left-padded with 0 bits. For any particular index q, a(q) and b(q) are initially stored in the same row

FIG. 2 is a flow chart of the algorithm of Sieworek et al. The input numbers are assumed to be M bits long. The m-th bit of a number a, b or s is designated by a[m], b[m] or s[m]. x refers to a bit stored in the tag register cell The activities of array processor In the initialization step (block The first machine cycle in the loop over m (block The second machine cycle in the loop over m (block The third machine cycle in the loop over m (block The fourth machine cycle in the loop over m (block In block

Shain, in U.S. patent application Ser. No. 10/108,451, which is incorporated by reference for all purposes as if fully set forth herein, teaches improved algorithms for addition and subtraction using an associative processor. Unlike the algorithm of Sieworek et al., these algorithms require only three machine cycles per pair of input bits. Shain's algorithms have certain other advantages over the algorithm of Sieworek et al., as explained in U.S. patent application Ser. No. 10/108,451. Nevertheless, all known prior art algorithms require that the numbers being combined be stored initially in separate sets of columns

According to the present invention, given Q binary numbers a(q), where q is an index between 1 and Q, all of the binary numbers having a common number M of bits indexed by an index m between 1 and M, there is provided a method of, for a positive integer P that is less than Q and for all values of q between 1 and P, combining a(q) with a(q+Q−P) to produce M combination bits, including the steps of: (a) providing an array processor that includes an array of content addressable memory (CAM) cells; (b) storing the binary numbers in respective consecutive rows of the array; (c) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and P: performing at least one first logical operation having a match signal corresponding to the m-th bit of a(q) as an input thereof, thereby producing a first output, and (d) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and P: performing at least one second logical operation having, as inputs thereof, a match signal corresponding to the m-th bit of a(q+Q−P) and the first output, thereby producing a second output.
According to the present invention, given Q binary numbers a(q), where q is an index between 1 and Q, all of the binary numbers having a common number M of bits indexed by an index m between 1 and M, there is provided a method of, for a positive integer P that is less than Q and for all values of q between 1 and P, adding a(q) to a(q+Q−P), including the steps of: (a) providing an array processor that includes an array of content addressable memory (CAM) cells; (b) storing the binary numbers in respective consecutive rows of the array; (c) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and P: (i) performing at least one first logical operation having, as inputs thereof, a match signal corresponding to the m-th bit of a(q) and a respective carry bit, thereby producing a first output, and (ii) performing at least one second logical operation having, as inputs thereof, the match signal corresponding to the m-th bit of a(q) and the respective carry bit, thereby producing a second output; and (d) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and Q: (i) performing at least one third logical operation having, as inputs thereof, a match signal corresponding to the m-th bit of a(q+Q−P) and the first output, thereby producing a third output, and (ii) performing at least one fourth logical operation having, as inputs thereof, the match signal corresponding to the m-th bit of a(q+Q−P) and the second output, thereby providing a fourth output. 
According to the present invention, given Q binary numbers a(q), where q is an index between 1 and Q, all of the binary numbers having a common number M of bits indexed by an index m between 1 and M, there is provided a method of, for a positive integer P that is less than Q and for all values of q between 1 and P, subtracting a(q+Q−P) from a(q), including the steps of: (a) providing an array processor that includes an array of content addressable memory (CAM) cells; (b) storing the binary numbers in respective consecutive rows of the array; and (c) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and P: (i) performing at least one first logical operation having, as inputs thereof, a match signal corresponding to the m-th bit of a(q) and a respective carry bit, thereby producing a first output, and (ii) performing at least one second logical operation having, as inputs thereof, the match signal corresponding to the m-th bit of a(q) and the respective carry bit, thereby producing a second output; and (d) for each value of m between 1 and M: substantially simultaneously, for each value of q between 1 and Q: (i) performing at least one third logical operation having, as inputs thereof, a match signal corresponding to the m-th bit of a(q+Q−P) and the first output, thereby producing a third output, and (ii) performing at least one fourth logical operation having, as inputs thereof, the match signal corresponding to the m-th bit of a(q+Q−P) and the second output, thereby providing a fourth output. The present invention is a method of in-place associative processor arithmetic. 
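The addition method summarized above can be simulated in a few lines of Python. The sketch below is an assumption-laden illustration, not the patented sequence of FIG. 3 or FIG. 4: the particular split of a full adder into a first pass over the bit of a(q) and a second pass over the bit of a(q+Q−P), the zero-filling `shift` helper, the separate `carry` list, and all function names are hypothetical. It does follow the stated outline: two outputs per row are computed from the m-th bit of a(q) and the carry, both tags registers are shifted by Q−P, two further outputs are computed from the m-th bit of a(q+Q−P), the registers are shifted back by P−Q, and the sum bit replaces the m-th bit of a(q).

```python
def bit(value, m):
    """m-th bit of value, with m = 1 the least significant bit."""
    return (value >> (m - 1)) & 1


def shift(reg, k):
    """Shift a tags register by k rows, zero-filling vacated cells (assumed)."""
    out = [0] * len(reg)
    for row, b in enumerate(reg):
        if 0 <= row + k < len(reg):
            out[row + k] = b
    return out


def add_in_place(a, P, M):
    """Return the rows after a(q) <- a(q) + a(q+Q-P) mod 2**M for q = 1..P.

    `a` lists the Q stored numbers; row q is a[q - 1].
    """
    Q = len(a)
    rows = list(a)
    carry = [0] * Q                      # carry bit kept per row q
    for m in range(1, M + 1):
        # First cycle: operate on the m-th bit of a(q) and the carry.
        tags1 = [0] * Q
        tags2 = [0] * Q
        for q in range(P):
            am = bit(rows[q], m)
            tags1[q] = am ^ carry[q]     # first output
            tags2[q] = am & carry[q]     # second output
        tags1 = shift(tags1, Q - P)      # align with rows of a(q+Q-P)
        tags2 = shift(tags2, Q - P)

        # Second cycle: operate on the m-th bit of a(q+Q-P).
        for r in range(Q - P, Q):
            bm = bit(rows[r], m)
            x = tags1[r]
            tags1[r] = bm ^ x                    # third output: sum bit
            tags2[r] = tags2[r] | (bm & x)       # fourth output: next carry
        tags1 = shift(tags1, P - Q)      # align back with rows of a(q)
        tags2 = shift(tags2, P - Q)

        # Write: replace the m-th bit of a(q) with the sum bit.
        for q in range(P):
            mask = 1 << (m - 1)
            rows[q] = (rows[q] & ~mask) | (tags1[q] << (m - 1))
            carry[q] = tags2[q]
    return rows


print(add_in_place([5, 3, 9, 12, 7, 1], P=2, M=4))
print(add_in_place([5, 3, 9, 12, 7, 1], P=4, M=4))
```

In the second call P exceeds Q−P, so some rows serve both as a(q) and as a(q+Q−P). The simulation still yields sums of the original values because, for each m, every read of bit m happens before bit m is overwritten.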
Given an ordered set of Q binary input numbers a(q), where q is an index that runs from 1 through Q, the numbers a(q) are stored, in order, in consecutive rows Letting m be an index that runs from 1 to M to index the bits of the numbers a(q) from least significant (m=1) to most significant (m=M), the present invention operates on one column At this point, if the logical operations have been those of Shain's improved algorithms, then, depending on the nature of the logical operations performed, the desired combination bits may be found either in tag register cells In the accompanying claims, the set of one or more logical operations, that are executed by logic units

The present implementation of Shain's algorithms retains their advantages over the prior art algorithm of Sieworek et al. Specifically:

1. As noted above, Shain's algorithms include only three machine cycles per pair of input bits: two compare cycles and one write cycle.

2. In Shain's algorithms, the bits x that are stored in tag register cells

3. Shain's second addition algorithm includes only five logical operations (two ANDs, two XORs, one OR) per pair of input bits, vs. nine logical operations per pair of input bits in the algorithm of Sieworek et al. Similarly, Shain's subtraction algorithm of the present invention includes only seven logical operations (two ANDs, two XORs, two NOTs, one OR) per pair of input bits, of which only five are binary logical operations.

4. In Shain's second addition algorithm, for each value of q between 1 and P, only three of the logical operations include a(q+Q−P) as a direct or indirect argument, vs. six logical operations in the algorithm of Sieworek et al. Similarly, in Shain's subtraction algorithm, for each value of q between 1 and P, only four of the logical operations include a(q+Q−P) as a direct or indirect argument.

5. Both Shain's second addition algorithm and Shain's subtraction algorithm include OR operations. The algorithm of Sieworek et al. lacks OR operations.

6. In both Shain's second addition algorithm and Shain's subtraction algorithm, there are only two XOR operations per pair of input bits, vs. seven XOR operations in Sieworek's algorithm.

7. In both Shain's second addition algorithm and Shain's subtraction algorithm, only logic units

The invention is herein described, by way of example only, with reference to the accompanying drawings, wherein:

FIG. 1 is a schematic illustration of an associative processor;
FIG. 2 is a flow chart of the prior art addition algorithm of Sieworek et al.;
FIG. 3 is a flow chart of Shain's first addition algorithm, as implemented according to the present invention;
FIG. 4 is a flow chart of Shain's second addition algorithm, as implemented according to the present invention;
FIG. 5 is a flow chart of Shain's subtraction algorithm, as implemented according to the present invention.

The present invention is of an in-place method of performing arithmetic using an associative processor. Specifically, the present invention can be used to perform Shain's improved addition and subtraction algorithms in place.

The principles and operation of in-place associative processor arithmetic according to the present invention may be better understood with reference to the drawings and the accompanying description.

Referring again to the drawings, FIGS. 3, The flow charts of FIGS. 3, Taking these notational differences into account, many of the blocks of FIGS. 3, The activities of array processor Block Block Block

Referring now to FIG. 4, the activities of array processor Block Block Block

Referring now to FIG. 5, the activities of array processor Block Block Block

While the invention has been described with respect to a limited number of embodiments, it will be appreciated that many variations, modifications and other applications of the invention may be made.
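The operation count quoted in the description for Shain's subtraction algorithm (two ANDs, two XORs, two NOTs, one OR per pair of input bits) is consistent with at least one textbook way of splitting a one-bit subtractor. The Python check below is a sketch under that assumption: the decomposition, the names `NOT` and `sub_bit`, and the use of the carry bit as a borrow are illustrative choices, and the patent's FIG. 5 may order or combine the operations differently.

```python
from itertools import product


def NOT(v):
    """Logical NOT of a single bit."""
    return 1 - v


def sub_bit(a, b, borrow):
    """One bit of a - b - borrow with 2 XORs, 2 NOTs, 2 ANDs and 1 OR.

    Returns (difference bit, borrow out). This decomposition is an
    assumption chosen to match the quoted operation count.
    """
    x = a ^ borrow                    # XOR 1: uses only a and the borrow
    y = NOT(a) & borrow               # NOT 1, AND 1: uses only a and the borrow
    diff = x ^ b                      # XOR 2: brings in b
    borrow_out = y | (NOT(x) & b)     # NOT 2, AND 2, OR 1
    return diff, borrow_out


# Compare against integer arithmetic for all eight input combinations.
ok = True
for a, b, c in product((0, 1), repeat=3):
    value = a - b - c
    expected = (value % 2, 1 if value < 0 else 0)
    ok = ok and sub_bit(a, b, c) == expected
print(ok)
```

As in the addition case, the first two operations depend only on the bit of the minuend and the borrow, so under this assumed split they could be evaluated in the row of a(q) before the tags registers are shifted to the row of a(q+Q−P).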