US 7421563 B2 Abstract A technique for generating a list of all N-bit unsigned binary numbers by starting with an initial number less than some power of 2, successively multiplying the number by that power of 2 and adding the largest non-negative number less than that power of 2 such that the new number is not a duplicate of any of those already generated, and using the resulting lists to generate efficient hashing and serial decoding hardware and software.
Claims(3) 1. A method of serially accessing all data locations of a contiguous linear storage array, using a single address, before repeating an access to a data location, the array being of a size equal to a power of two elements, the method comprising:
beginning with a current address of a location in said array, accessing data stored at the location corresponding to said current address;
generating a next address of a location within said array, said generating comprising doubling said current address and inserting a least-order bit of said data into a least-order bit position of a result of doubling the current address to obtain said next address;
setting said current address equal to said next address; and
iterating said accessing, said generating and said setting until all data locations have been accessed.
2. A method as in
3. A method of defining connections in a serial shift register decoder for an array of size equal to a power of 2 elements, said decoder comprised of at least one circular shift register, the method comprising:
generating an ordered list of all numbers between zero and said size minus one, said generating comprising: beginning with an initial value and successively choosing a power of 2 times said initial value plus an increment, modulo said size of said decoder, as a next number, wherein said increment is the largest number less than said power of 2 such that said next number has not already been chosen, and wherein said initial value is any non-negative number less than said power of 2, and wherein said next number replaces said initial value to generate another next number;
defining address connections to bits of said at least one circular shift register according to low-order bits of the numbers of said ordered list; and
accessing said address connections to bits of said at least one circular shift register.
Description The present invention pertains to decoding a serial stream of address data and traversing all entries of a Hash Table in a non-incremental fashion. Both the structure of the serial decoder and the data used to generate the non-incremental Hash Table addresses are derived from the results of an algorithm that generates a list of 2^N numbers in non-numerical order.
Numerous examples of serial memory addressing exist, including U.S. Pat. No. 4,667,313, granted May 19, 1984 to Pinkham et al.; U.S. Pat. No. 5,452,255, granted Sep. 9, 1995 to Mine et al.; and U.S. Pat. No. 6,639,867, granted Oct. 28, 2003, but they all serially load a register with the address and then use various combinations of the shift register data and traditional decode logic to select the specific word, as opposed to directly serially decoding the address.
Serial loading of addresses, as well as serial access of successive memory locations, is becoming more important as the asynchronous, high-bandwidth nature of communications between chips and the cost of integrated circuit I/O, coupled with inherently slower on-chip clocking, is resulting in a shift away from parallel synchronous input to high-speed Serialize/De-serialize (SERDES) input. Currently the external serial input is shifted into a register, which is driven in parallel into the chip when the last bit is captured. If the data is an address, it is decoded in parallel in order to access the memory. As a result, when accessing an on-chip memory, the delay of the parallel decode and word access is incurred after the last external bit is available. Alternatively, in high-speed systems, the address is broken into high- and low-order bits. Presumably the high-order bits are loaded first, and then the low-order bits are loaded. These low-order bits are powered up and drive successive levels of multiplexers to make the data available on the output shortly after the low-order bits are loaded into their register.
Using serial address data to access memories is a typical operation in high-speed serial communications routers and switches, which must be able to translate a large number of addresses into a limited number of port addresses or to indicate that the address is not translated to a port in this network. Content-Addressable Memories (CAMs) have occasionally been used to do this type of operation. CAMs have the advantage of completing the search in one operation, but they require both a lot of time per access and a lot of transistors per memory element because they compare the inputs with all the words in their memories on each operation.
An alternative is the hardware equivalent of a software technique called hashing. When the address space for a limited number of items is much larger than the small linear array in which they are stored, hashing searches for an item by accessing the array with an index address that is both within the range of the array and an arithmetic function of the item's full address. This implies that many items with different full addresses could map to the same index address. The problem then becomes storing the items efficiently enough in the array to maximize the utilization of the array while finding any given entry in the fewest number of memory accesses.
Existing software hashing techniques suggest that hashing tables contain a prime number of locations and that the index for addressing into the hashing tables map the large address space evenly into each of the locations. Since there could be multiple items in the table with the same index, only one of which is stored at the index address, it is also necessary to step through the memory in some fashion if the first item accessed is not the correct one. Many hashing algorithms suggest incrementing through the hashing table with another prime that is smaller than the size of the table.
Ring theory shows that this will ultimately traverse through all elements in the table without repetition. One way to create an initial index and increment may be:
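As a rough illustration of the traditional approach just described, the sketch below derives an initial index and an increment from an item's full address and lists the locations visited. The table size of 13, the step of 7, and the function names are assumptions chosen for the example; they are not values taken from this disclosure.

```python
TABLE_SIZE = 13   # assumed: a prime number of table locations
STEP_PRIME = 7    # assumed: a smaller prime used as the increment


def initial_index_and_increment(item):
    """Map an item's full address to a starting index and a fixed increment."""
    index = item % TABLE_SIZE
    return index, STEP_PRIME


def probe_sequence(item):
    """Return every index visited when stepping by the increment, modulo the table size."""
    index, increment = initial_index_and_increment(item)
    visited = []
    for _ in range(TABLE_SIZE):
        visited.append(index)
        index = (index + increment) % TABLE_SIZE
    return visited
```

Because the table size is prime, any step between 1 and TABLE_SIZE - 1 visits every location once before returning to the start.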
Now to determine the existence of an item in the table, perform the following:
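A minimal sketch of such a lookup, assuming a Python list in which `None` stands for the null value and an index and increment obtained as described above; the function name `probe` and the sample values are hypothetical.

```python
def probe(table, item, index, increment):
    """Step through the table until the item or an empty (None) slot is reached."""
    size = len(table)
    for _ in range(size):
        if table[index] is None or table[index] == item:
            return index
        index = (index + increment) % size
    raise RuntimeError("table is full and does not contain the item")


# Usage: items 5, 18 and 31 all start at index 5 in a 13-entry table.
table = [None] * 13
for item in (5, 18, 31):
    slot = probe(table, item, item % 13, 7)
    if table[slot] is None:      # not found, so add it
        table[slot] = item
```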
If the Table(Index) is null, then the item doesn't exist in the table; otherwise it does. In many applications, if it doesn't exist, it is added to the table. Clearly, the table begins with all null values and becomes more inefficient in determining if an item is in the table as the number of entries grows, since clashes exist with multiple items that map to the same hashing index. Traditionally the Increment is a constant 1, which makes the next index easy to create, but often the data put into a hash table clusters about a few numbers, which makes the simple increment inefficient. Two solutions have been proposed in the past to eliminate this problem: either create an index that sufficiently separates otherwise adjacent entries, or create an increment that is both prime and varies with the entry in a way that is sufficiently different from the index to cause entries with the same initial index to have different increments. The difficulty with this approach, when building hardware, is the need for a multiply or divide and the use of prime numbers, which are not easily addressable. Patents such as U.S. Pat. No. 6,804,768, granted Oct. 12, 2004 to Moyer, and U.S. Pat. No. 6,785,278, granted Aug. 31, 2004 to Calvignau, et al., describe ways to create indexes without doing multiplies or divides, but not the increment function. The best solution would be to traverse a space that is N binary address bits, or 2 This disclosure describes a method for accessing all data within a contiguous linear array before repeating by generating each subsequent address using a function of the current address and the data at the current address within the array, where the array is a power of 2 in size, and the function is a single bit shift and low-order bit insertion. 
More specifically, some portion of the data within the array is generated by sorting a backward-rotated list of the least-order bits of a list of numbers that covers the address space of the array, where the list is generated by beginning with the number zero and successively choosing between double the number and double the number plus one, each taken modulo the size of the array, as the next number, where double the number is chosen if double the number plus one is already in the list.
The disclosure also describes a method for generating an ordered list of all numbers between −1 and some power of 2 (that is, from zero to that power of 2 minus one), by beginning with the number zero and successively choosing between double the number and double the number plus one as the next number, wherein double the said number is chosen if double the said number plus one is already in the list; and the least-order bit of said ordered list is used to define connections in a serial shift register decoder. This forms the mathematical basis of the above-described hashing technique.
The serial shift register decoder comprises a single address input; a multiplicity of word line outputs; and a multiplicity of shift register stages, where each of the shift register stages is connected to a previous and a next shift register stage such that the data shifts through the shift register stages in a circular fashion, the input of half of the shift register stages is the AND of the said address input and the previous shift register stage, the input of the other half of the shift register stages is the AND of the inverse of the address input and the previous shift register stage, and one of the word line outputs is driven by each shift register stage.
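The decoder structure described above can be modelled in a few lines. The sketch below is a behavioural simulation only, under several assumptions: a 16-word array, address bits entered high-order bit first, all stages set to 1 at the start of a decode, and a stage ordering (`ORDER`) worked out for this example from the doubling rule rather than copied from the disclosure's tables. All names are hypothetical.

```python
# Word-line number assumed for each stage of a 16-word array (N = 4).
ORDER = [0, 1, 3, 7, 15, 14, 13, 11, 6, 12, 9, 2, 5, 10, 4, 8]
# Polarity 1: stage input is (address AND previous stage).
# Polarity 0: stage input is (NOT address AND previous stage).
POLARITY = [n & 1 for n in ORDER]


def decode(address, n_bits=4):
    """Shift the address in, high-order bit first, and return the stage values."""
    stages = [1] * len(ORDER)              # decode starts with every stage set
    for k in range(n_bits - 1, -1, -1):
        bit = (address >> k) & 1
        # stages[i - 1] wraps to the last stage when i == 0 (circular register)
        stages = [stages[i - 1] & (bit if POLARITY[i] else 1 - bit)
                  for i in range(len(stages))]
    return stages


def shift(stages):
    """Shift mode: every stage passes its value to the next stage unchanged."""
    return stages[-1:] + stages[:-1]
```

In this model, `decode` leaves exactly one stage set after the last address bit, and `shift` moves that single 1 to the following stage, whose word line is double the current address plus that stage's polarity bit, modulo 16.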
In one mode, after serially applying all address bits to the serial decoder at one bit per clock cycle, one of the shift register stages of the serial decoder drives its word line to the opposite state of all the other word lines; and in another mode, after each clock cycle, the shift register stage that drives that one word line transfers its data to the next shift register stage.
The invention will now be described in connection with the attached drawings, in which:
Decoding is basically translating an address of N bits into 0s on 2^N − 1 lines and a logic level 1 on the specific line corresponding to the value of the N bits.
A preferred embodiment of the present invention is a serial shift register-based decoder that takes fewer gates than the equivalent traditional decoder structure when decoding more than 4-bit addresses.
Furthermore, a traditional N-bit decoder built out of 2-input NAND gates has a delay path from its address register that includes driving 2
The key to a properly functioning shift register-based serial decoder is the sign of the Address bit inputted into each stage of the shift register. Each successive address bit must be entered using the same structure, while the resulting decoded address shifts one location around the circular shift register; but in a traditional decoder, each address bit stays in a constant location and drives different AND gates with different polarities, so a simple upper-half/lower-half decode for each bit doesn't work on a shift register-based serial decoder. What is needed is a way to distribute all possible combinations of +/−A
It is mathematically known that a set of all permutations of +/−A
Each column to the right of the gray column is a shift stage, where the bit loaded on that cycle is at the top of the column. In these three examples, the polarities were selected such that the N-bit address is translated into 2 Still, in order for the decode technique to be useful, it is necessary to generate such a structure for 2 This type of decoder generates an ordered list of the 2
Closer inspection of these results shows that each shift corresponds to doubling the previous stage's number and adding a 1 or 0 depending on the polarity of the stage of the shift register, and while there are duplications of numbers prior to N shifts, the results after N shifts are such that each number in each location is unique. The key to generating such a list of polarities should be to create a 2
The first column in each example contains the number of the shift stage. The second column contains an X where a 1 was added to the number in the previous stage, and the third column contains the number for that stage of the shift register. As can be seen in Table 3, all N-bit numbers are represented in the 2
The code for a generation algorithm, which creates a list of 2^N elements corresponding to the 2
The columns labeled Note that as each bit of the 4-bit addresses is entered, the number of shift stages that are still set to a one level is reduced to half of the number on the previous cycle. Eight stages are set to one on cycles This serial decoder is designed for high-speed operation. The shift stages naturally want to be adjacent to each other, and by interleaving them in a fashion shown in Reference is now made to Reference is again made to Reference is now made to Now the delay of the traditional bit select and serial decode can be compared. The worst case delay path for the low-order address bits of the memory in Reference is again made to In yet another embodiment, the serial decoder, when implemented on a memory with its shift address option as shown in In yet another embodiment of the current invention, the technique used to define the polarity of the address bits in the serial decoder may be used in conjunction with a normal memory to create a fast hardware or software hashing algorithm. In general, the best hashing algorithm would be to traverse a space that is N binary address bits, or 2 The basis behind the serial decoder is a double and increment, which can be accomplished by a shift operation and least-order bit insertion. The examples in Table 3 above show full non-incremental traversals of the entire 2
It should be noted here that by successively going to the location specified in the right-most next column, all 16 locations in the table are accessed before repeating. Now, using the following procedure, the list of numbers can be generated:
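One way such a procedure might look is sketched below. It follows the rule stated in this description (start at zero; take double the last number plus one unless that value is already in the list, otherwise take double the last number, both modulo the table size). The function name is hypothetical, and a set is used here in place of a search through the list.

```python
def ordered_list(n_bits):
    """Generate the 2**n_bits numbers in the doubling order described in the text."""
    size = 1 << n_bits
    numbers = [0]
    seen = {0}
    while len(numbers) < size:
        candidate = (2 * numbers[-1] + 1) % size   # prefer double plus one
        if candidate in seen:
            candidate = (2 * numbers[-1]) % size   # otherwise use double
        numbers.append(candidate)
        seen.add(candidate)
    return numbers


print(ordered_list(4))
# [0, 1, 3, 7, 15, 14, 13, 11, 6, 12, 9, 2, 5, 10, 4, 8]
```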
This algorithm searches through the list of numbers already generated for one plus double the last number. If that number already exists, then double the number is used. It generates a non-repeating ordered list of numbers from 0 to 2^N − 1.
Now the list of numbers can be used to create a set of increments that can be used in the following hashing algorithm:
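A sketch of a hashing search of this kind is given below for a 16-entry table. The next index is formed by shifting the current index left one bit and inserting the Increment bit stored for the current location. The `INCREMENT` values were derived for this example from the doubling rule, and the choice of the item's low-order bits as the initial index, along with all names, is an assumption.

```python
# Assumed Increment bit for each of the 16 table addresses (N = 4).
INCREMENT = [1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0]
MASK = len(INCREMENT) - 1


def next_index(index):
    """Shift left one bit, insert the Increment bit, keep N bits."""
    return ((index << 1) | INCREMENT[index]) & MASK


def search(table, item):
    """Return the slot holding the item or the first empty slot; None if the table is full."""
    index = item & MASK          # assumed index function: low-order bits of the item
    for _ in range(len(table)):
        if table[index] is None or table[index] == item:
            return index
        index = next_index(index)
    return None


# Usage: 3, 19 and 35 share the initial index 3 and are spread by the increments.
table = [None] * 16
for item in (3, 19, 35):
    table[search(table, item)] = item
```

No add or multiply is needed to step through the table; a shift, an OR and a mask suffice.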
To create the Increment table, merely do the following:
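The construction can be sketched as follows, assuming the ordered list for a 16-entry table has already been generated by the doubling rule (it is hard-coded here so the example stands alone). The least-order bits are rotated back one position, so that each entry carries the low bit of its successor, and are then placed by address. The function name is hypothetical.

```python
# Assumed ordered list for N = 4, produced by the doubling rule.
NUMBERS = [0, 1, 3, 7, 15, 14, 13, 11, 6, 12, 9, 2, 5, 10, 4, 8]


def increment_table(numbers):
    """Build the per-address Increment bits from the ordered list."""
    low_bits = [n & 1 for n in numbers]
    rotated = low_bits[1:] + low_bits[:1]   # entry i now holds the low bit of entry i + 1
    table = [0] * len(numbers)
    for address, bit in zip(numbers, rotated):
        table[address] = bit                # placing by address performs the sort
    return table


print(increment_table(NUMBERS))
# [1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0]
```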
The Increment table could, for example, be one bit in the hash table. It should be noted here that the serial decoder automatically does this operation when moving from decode to serial shift mode. This technique may also be implemented in software, and as shown in
Clearly, the assembly code example above presumes the Hash Table entries were initialized with the Null value concatenated with the Increment value. Since the Increment bit is the sign bit, this code assumes the Items and Results are always positive (including the Null value), because it clears the Increment bit from the Result before comparing Item to it. It also assumes that the Table Address is positive because it puts the Increment bit into the high-order bit of the Index before rotating. It is further understood that a person skilled in the art may further optimize the code for any specific processor, and may modify it to handle Table insertions and/or subsequent searches in a different manner than presented above. Still, the above example does illustrate the technique's ability to traverse the Hash Table in a non-incremental way, without the use of Add or Multiply instructions. In yet another embodiment of the present invention, serial decoders may consist of multiple circular shift register strings. The techniques described above may apply to any number of bits shifted in parallel into the decoder. Reference is now made to
Like the ordered list of numbers shown in Table 3, the next value in the list is determined by shifting the current value and adding the increment, but in this case the number of bits shifted is greater than one (if M>1) and the increment could be as large as 2^M − 1.
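The multi-bit rule can be sketched in the same way as the single-bit one. The code below shifts the current value by M bits and tries increments from the largest downward until it finds a value not yet in the list. It starts from zero, assumes M divides N, and uses hypothetical names; its output is not claimed to match the ordering of Table 6, which may have been produced from a different initial value.

```python
def ordered_list_multibit(n_bits, m_bits, start=0):
    """Generate 2**n_bits numbers, shifting m_bits at a time and preferring the largest increment."""
    size = 1 << n_bits
    radix = 1 << m_bits
    numbers = [start]
    seen = {start}
    while len(numbers) < size:
        base = (numbers[-1] << m_bits) % size
        for inc in range(radix - 1, -1, -1):      # largest increment first
            candidate = base + inc
            if candidate not in seen:
                break
        else:
            raise ValueError("no unused successor; the list cannot be completed")
        numbers.append(candidate)
        seen.add(candidate)
    return numbers


print(ordered_list_multibit(4, 2))
# [0, 3, 15, 14, 11, 13, 7, 12, 2, 10, 9, 6, 8, 1, 5, 4]
```

With `m_bits=1` the function reduces to the single-bit doubling rule.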
The first row of Table 6 shows the Inc values for the 16 word lines. This translates into the polarity lists 0 and 1 applied to the rings in the example, the next two rows in the table. The fourth row in the table is the real decoded address, with the numerical order of the stages shown in the fifth row. Like the simulation shown in Table 4, the rings begin an address decode by being set to all 1s, but in this case the 4 to 16 decode shows the 4-bit address being inputted in two consecutive cycles of two bits each, where A1 on the first cycle is the highest order bit of the address, and A0 on the second cycle is the lowest order bit of the address. This example shows a decode of the number 9, which is located in stage 8. Clearly, this approach takes almost twice the gates of a single-bit decode, but can decode two bits at a time, and the whole address in half as many cycles as a single-bit serial decoder. Furthermore this technique may be generalized as described above to generate a decode comprised of any M circular shift registers each with 2 In yet another embodiment of the present invention, the hashing increment may also be more than one bit. A two-bit increment for a hash code increment can be generated from the Inc values in Table 6 in a similar fashion to generating the single-bit increment values. By rotating the values in the left direction one slot and sorting the resulting numbers by the real decode values, the following Table 7 is generated:
The Inc array shown in the first row of Table 6 is reproduced as the top row of Table 7, along with the real addresses from the fourth row of Table 6 in the second row of Table 7. When the Inc array values are shifted left one, as shown in the third row of Table 7, and both the real addresses and the increments are sorted by the real addresses, the resulting increment order can be found in the bottom row of the table.
It is also contemplated that the generated polarity lists may be used in reverse order, or in any other order that meets the requirements to create serial decoders and hashing structures such as described above. It is further contemplated that other serial functions may be generated comprising one or more features of the above serial decoder or hashing table, such as serial comparators, serial selectors, serially addressed switches, serially addressable polling structures, or other similar structures.
It will be appreciated by persons skilled in the art that the present invention is not limited by what has been particularly shown and described hereinabove. Rather, the scope of the present invention includes both combinations and sub-combinations of various features described hereinabove as well as modifications and variations which would occur to persons skilled in the art upon reading the foregoing description and which are not in the prior art.