WO2010033298A1 - Address generation - Google Patents

Info

Publication number
WO2010033298A1
Authority
WO
WIPO (PCT)
Prior art keywords
sum
address
output
sequence
register
Prior art date
Application number
PCT/US2009/051224
Other languages
French (fr)
Inventor
Colin Stirling
David I. Lawrie
David Andrews
Original Assignee
Xilinx, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xilinx, Inc. filed Critical Xilinx, Inc.
Priority to JP2011527850A priority Critical patent/JP5242796B2/en
Priority to KR1020117008803A priority patent/KR101263152B1/en
Priority to EP09790666.3A priority patent/EP2329362B1/en
Priority to CN200980136779.7A priority patent/CN102160032B/en
Publication of WO2010033298A1 publication Critical patent/WO2010033298A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/34 Addressing or accessing the instruction operand or the result; Formation of operand address; Addressing modes
    • G06F9/345 Addressing or accessing the instruction operand or the result; Formation of operand address; Addressing modes of multiple operands or results
    • G06F9/3455 Addressing or accessing the instruction operand or the result; Formation of operand address; Addressing modes of multiple operands or results using stride
    • G06F9/355 Indexed addressing
    • G06F9/3552 Indexed addressing using wraparound, e.g. modulo or circular addressing
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3867 Concurrent instruction execution, e.g. pipeline, look ahead using instruction pipelines
    • G06F9/3875 Pipelining a single stage, e.g. superpipelining
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/27 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes using interleaving techniques
    • H03M13/2739 Permutation polynomial interleaver, e.g. quadratic permutation polynomial [QPP] interleaver and quadratic congruence interleaver

Definitions

  • the invention relates to integrated circuit devices ("ICs"). More particularly, the invention relates to address generation by an IC.
  • PLDs Programmable logic devices
  • FPGA field programmable gate array
  • An FPGA typically includes an array of programmable tiles. These programmable tiles can include, for example, input/output blocks ("IOBs"), configurable logic blocks ("CLBs"), dedicated random access memory blocks ("BRAMs"), multipliers, digital signal processing blocks ("DSPs"), processors, clock managers, delay lock loops ("DLLs"), and so forth.
  • IOBs input/output blocks
  • CLBs configurable logic blocks
  • BRAMs dedicated random access memory blocks
  • DSPs digital signal processing blocks
  • DLLs delay lock loops
  • Each programmable tile typically includes both programmable interconnect and programmable logic.
  • the programmable interconnect typically includes a large number of interconnect lines of varying lengths interconnected by programmable interconnect points ("PIPs").
  • PIPs programmable interconnect points
  • the programmable logic implements the logic of a user design using programmable elements that can include, for example, function generators, registers, arithmetic logic, and so forth.
  • the programmable interconnect and programmable logic are typically programmed by loading a stream of configuration data into internal configuration memory cells that define how the programmable elements are configured.
  • the configuration data can be read from memory (e.g., from an external PROM) or written into the FPGA by an external device.
  • the collective states of the individual memory cells then determine the function of the FPGA.
  • a CPLD includes two or more "function blocks” connected together and to input/output ("I/O") resources by an interconnect switch matrix.
  • Each function block of the CPLD includes a two-level AND/OR structure similar to those used in Programmable Logic Arrays ("PLAs”) and Programmable Array Logic (“PAL”) devices.
  • PLAs Programmable Logic Arrays
  • PAL Programmable Array Logic
  • configuration data is typically stored on-chip in non-volatile memory.
  • configuration data is stored on-chip in non-volatile memory, then downloaded to volatile memory as part of an initial configuration (programming) sequence.
  • PLDs programmable logic devices
  • the functionality of the device is controlled by data bits provided to the device for that purpose.
  • the data bits can be stored in volatile memory (e.g., static memory cells, as in FPGAs and some CPLDs), in non-volatile memory (e.g., FLASH memory, as in some CPLDs), or in any other type of memory cell.
  • PLDs are programmed by applying a processing layer, such as a metal layer, that programmably interconnects the various elements on the device. These PLDs are known as mask programmable devices. PLDs can also be implemented in other ways, e.g., using fuse or antifuse technology.
  • the terms "PLD" and "programmable logic device" include but are not limited to these exemplary devices, as well as encompassing devices that are only partially programmable. For example, one type of PLD includes a combination of hard-coded transistor logic and a programmable switch fabric that programmably interconnects the hard-coded transistor logic.
  • Turbo-channel codes conventionally are used to code data. Turbo codes use data in the order in which it is received and in an interleaved order. Original data is therefore used twice. By turbo-channel codes, it is meant convolutional codes. The data is shuffled using an interleaver, and such interleaver may be part of an encoder, a decoder, or an encoder/decoder ("codec").
  • Data may be interleaved prior to encoding and then deinterleaved for decoding.
  • Some coding, including either or both encoding and decoding, systems have high throughputs achieved through parallel processing.
  • Data is generally interleaved by an encoder and deinterleaved by a decoder. Because decoding is more computationally intensive than encoding, and in order to achieve overall system high throughput, deinterleaving should be capable of being implemented in parallel in the decoder.
  • 3GPP 3rd Generation Partnership Project
  • QPP quadratic permutation polynomial
  • LTE Long Term Evolution
  • Additional details regarding 3GPP LTE may be found at http://www.3gpp.org.
  • the 3GPP TS 36.212 version 8.3.0 Technical Specification dated May 2008 discloses channel coding, multiplexing, and interleaving in section 5 thereof, particularly sub-sections 5.1.3, 5.1.4.1.1, and 5.2.2.8 describing a channel interleaver.
  • Using a QPP interleaver allows individual blocks of data to be split into multiple threads and processed in parallel. If multiple independent blocks of data each have their threads processed, then processing such threads of all such data blocks in parallel involves replicating the QPP interleaver. Accordingly, it should be appreciated that the size and performance of an interleaver circuit used to implement a QPP interleaver affects both efficiency of encoding and decoding turbo-channel codes.
  • An embodiment of an address generator comprises a first processing unit, and a second processing unit coupled to receive a stage output from the first processing unit and configured to provide an address output.
  • the stage output is in a first range from -K to -1 for a block size of K, and the address output is in a second range from 0 to K-1.
  • the address generator can be part of a coding device selected from a group consisting of an encoder, a decoder, and a codec, where the address generator provides the address output for quadratic permutation polynomial interleaving.
  • the address output can include multiple address sequences.
  • the first processing unit and the second processing unit respectively can be initialized with a first initialization value or a second initialization value.
  • the first initialization value can be for a first sequence of the multiple address sequences
  • the second initialization value can be for a second sequence of the multiple address sequences.
  • the address output can be for at least part of an address sequence from 0 to K-1; the first processing unit can be initialized with a first initialization value and a second initialization value; and the second processing unit can be initialized with a third initialization value and a fourth initialization value.
  • the first processing unit can comprise a first adder; a first register, coupled to the first adder; a first multiplexer, coupled to the first register; a first subtractor, coupled to the first multiplexer and the first register; and a second register, coupled to the first subtractor, to output the stage output, where the stage output is fed back to the first adder.
  • the first register can process a first sequence and the second register can simultaneously process a second sequence.
  • the second processing unit can comprise: a second adder to receive the stage output; a third register, coupled to the second adder; a second multiplexer, coupled to the third register; a third adder, coupled to the second multiplexer and the third register; and a fourth register, coupled to the third adder, to output the address output, where the address output can be fed back to an input of the second adder.
  • An embodiment of a method to generate addresses comprises: obtaining a step size and a block size; obtaining a first initialization value and a second initialization value; adding the step size to a difference to provide a first sum; subtracting either a null value or the block size from the first sum responsive to a sign bit of the first sum to provide another difference, where the other difference is in a range of -K to -1 for block size of K; registering the first sum or the other difference; and feeding back the other difference in order to add the other difference to the step size.
  • the method can further comprise: generating a second sum by adding the other difference to a third sum; adding either the null value or the block size to the second sum in response to a sign bit of the second sum to provide another third sum, where the other third sum is in a range of 0 to K-1; registering the second sum or the other third sum; and feeding back the other third sum for another iteration of the step for adding to provide the second sum.
  • the registering the first sum or the other difference can include registering the other difference within respective feedback loops for pipelined operation, and where registering the second sum or the other third sum can include registering the other third sum within respective feedback loops for pipelined operation.
  • the registering the first sum or the other difference can include registering the first sum within respective feedback loops for pipelined operation, and where the registering the second sum or the other third sum can include registering the second sum within respective feedback loops for pipelined operation.
  • the step of adding the step size to the difference to provide the first sum can be performed simultaneously with the step of adding to provide the second sum by addition of the other difference to the third sum.
  • the method can further comprise providing the other third sum for quadratic permutation polynomial interleaving.
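  • The method steps above can be sketched in software as follows. This is a minimal behavioral sketch, not the claimed circuit: the function name `generate_addresses` and its parameter names are hypothetical, registers and pipelining are ignored, and it is assumed that the first initialization value is the starting difference and the second is the starting address.

```python
def generate_addresses(block_size, step_size, init_i, init_a, count):
    """Return `count` addresses using only add, subtract, and select.

    Assumption: init_i is congruent (mod K) to the first difference
    between successive addresses, and init_a is the first address.
    """
    k = block_size
    diff = init_i  # first-stage value ("difference" in the method)
    addr = init_a  # second-stage value ("third sum" in the method)
    out = []
    for _ in range(count):
        # Subtract K or a null value depending on the sign of the first sum,
        # which leaves the difference in the range -K to -1.
        diff = diff - (k if diff >= 0 else 0)
        # Add K or a null value depending on the sign of the second sum,
        # which leaves the address in the range 0 to K-1.
        addr = addr + (k if addr < 0 else 0)
        out.append(addr)
        # Both sums are formed from the values fed back in this iteration.
        addr, diff = addr + diff, diff + step_size
    return out
```

  • For example, with the assumed values K = 40, f1 = 3, f2 = 10 and a skip of 1, calling `generate_addresses(40, 20, 13, 0, 40)` reproduces the sequence given by Equation (1) without any multiplication or modulo operation in the loop.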
  • FIG. 1 is a simplified block diagram depicting an exemplary embodiment of a columnar Field Programmable Gate Array (“FPGA”) architecture in which one or more aspects of the invention may be implemented.
  • FPGA Field Programmable Gate Array
  • FIG. 2 is a block diagram depicting an exemplary embodiment of an interleaver.
  • FIG. 3 is a circuit diagram depicting an exemplary embodiment of an address generator of the interleaver of FIG. 2.
  • FIG. 4 is a flow diagram depicting an exemplary embodiment of an address generation flow of the address generator of FIG. 3.
  • FIG. 5 is a pseudo-code listing depicting an exemplary embodiment of an address generation flow.
  • FIG. 1 illustrates an FPGA architecture 100 that includes a large number of different programmable tiles including multi-gigabit transceivers ("MGTs") 101, configurable logic blocks ("CLBs") 102, random access memory blocks ("BRAMs") 103, input/output blocks ("IOBs") 104, configuration and clocking logic ("CONFIG/CLOCKS") 105, digital signal processing blocks ("DSPs") 106, specialized input/output blocks ("I/O") 107 (e.g., configuration ports and clock ports), and other programmable logic 108 such as digital clock managers, analog-to-digital converters, system monitoring logic, and so forth.
  • Some FPGAs also include dedicated processor blocks (“PROC”) 110.
  • PROC dedicated processor blocks
  • each programmable tile includes a programmable interconnect element ("INT") 111 having standardized connections to and from a corresponding interconnect element in each adjacent tile. Therefore, the programmable interconnect elements taken together implement the programmable interconnect structure for the illustrated FPGA.
  • the programmable interconnect element 111 also includes the connections to and from the programmable logic element within the same tile, as shown by the examples included at the top of FIG. 1.
  • a CLB 102 can include a configurable logic element (“CLE”) 112 that can be programmed to implement user logic plus a single programmable interconnect element (“INT") 111.
  • a BRAM 103 can include a BRAM logic element (“BRL”) 113 in addition to one or more programmable interconnect elements.
  • BRL BRAM logic element
  • the number of interconnect elements included in a tile depends on the height of the tile. In the pictured embodiment, a BRAM tile has the same height as five CLBs, but other numbers (e.g., four) can also be used.
  • a DSP tile 106 can include a DSP logic element (“DSPL”) 114 in addition to an appropriate number of programmable interconnect elements.
  • An IOB 104 can include, for example, two instances of an input/output logic element ("IOL") 115 in addition to one instance of the programmable interconnect element 111.
  • IOL input/output logic element
  • the actual I/O pads connected, for example, to the I/O logic element 115 typically are not confined to the area of the input/output logic element 115. In the pictured embodiment, a columnar area near the center of the die (shown in FIG. 1) is used for configuration, clock, and other control logic. Horizontal areas 109 extending from this column are used to distribute the clocks and configuration signals across the breadth of the FPGA.
  • Some FPGAs utilizing the architecture illustrated in FIG. 1 include additional logic blocks that disrupt the regular columnar structure making up a large part of the FPGA.
  • the additional logic blocks can be programmable blocks and/or dedicated logic.
  • processor block 110 spans several columns of CLBs and BRAMs.
  • FIG. 1 is intended to illustrate only an exemplary FPGA architecture.
  • the numbers of logic blocks in a column, the relative width of the columns, the number and order of columns, the types of logic blocks included in the columns, the relative sizes of the logic blocks, and the interconnect/logic implementations included at the top of FIG. 1 are purely exemplary.
  • more than one adjacent column of CLBs is typically included wherever the CLBs appear, to facilitate the efficient implementation of user logic, but the number of adjacent CLB columns varies with the overall size of the FPGA.
  • a QPP interleaver is specified in an LTE 3GPP specification, and such QPP interleaver may be formulated as a quadratic equation modulo the block size, K.
  • a direct implementation of the specified QPP interleaving process would involve complex multiplication and complex modulo operations, which are extremely inefficient for implementation in hardware.
  • a more efficient hardware implementation is described in co-pending U.S. Patent Application entitled "Address Generation for Quadratic Permutation Polynomial Interleaving" by Ben J. Jones et al., assigned application number 12/059,731, filed March 31, 2008 (Attorney Docket No. X-2726 US) [hereinafter "Jones"].
  • Jones shows and describes how the quadratic formula may be reduced to produce a circuit which may be implemented using adders, subtractors, and selection circuits, such as multiplexers.
  • an even further simplified circuit for address generation for interleaving may be obtained by removing selection operations associated with Jones and reducing the number of adders and subtractors of Jones.
  • reduction of circuitry in turn reduces register count in comparison to Jones, but as shall be appreciated from the following description, such simplified address generator has the same or comparable performance to that of Jones.
  • Another reduction in comparison to Jones is elimination of registers between first and second stages, allowing control logic to be further simplified, as initialization values may be applied simultaneously as described below in additional detail.
  • An LTE 3GPP QPP interleaver has an address sequence as defined by:
  • $\Pi(x) = (f_1 x + f_2 x^2) \bmod K$, where $0 \le x, f_1, f_2 < K$, (1) where $f_1$ and $f_2$ are coefficients of the polynomial, x is an increment in a linear sequence from 0 to K-1, and K is block size.
  • An x-th interleaved address may be obtained by using Equation (1), where $f_1$ and $f_2$ are fixed coefficients for any integer block size, K. Accordingly, the sequence of addresses for increments of x are from 0 to K-1 in a permutated order for x.
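  • To make Equation (1) concrete, the following sketch evaluates it directly and collects the full address sequence. The function names are hypothetical, and the example parameters K = 40, f1 = 3, f2 = 10 are assumed here for illustration (they are believed to correspond to the smallest block size in the 3GPP parameter table, but that should be checked against the specification).

```python
def qpp_address(x, f1, f2, block_size):
    """Evaluate Equation (1) for a single index x."""
    return (f1 * x + f2 * x * x) % block_size


def qpp_sequence(f1, f2, block_size):
    """Return the interleaved addresses for x = 0 .. K-1."""
    return [qpp_address(x, f1, f2, block_size) for x in range(block_size)]


if __name__ == "__main__":
    K, F1, F2 = 40, 3, 10  # assumed example parameters
    addresses = qpp_sequence(F1, F2, K)
    # A valid interleaver visits every address from 0 to K-1 exactly once.
    print(sorted(addresses) == list(range(K)))
    print(addresses[:6])
```

  • This direct form uses a multiplication and a modulo per address, which is the cost the add, subtract, and select formulation is intended to avoid.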
  • Equation (1 ) a first derivation of Equation (1 ) is:
  • Equation (1 ) a second derivation of Equation (1 ) is:
  • n is a skip value which may be any integer value greater than 0.
  • the skip value, n, may be used to determine the stride or jump in an interleaved address sequence generated.
  • When n is set to 1, a complete sequence of K addresses may be generated; however, if n is set to an integer value larger than 1, then a subset of addresses of a sequence may be generated. For example, if n is set equal to 2, then every other address in a sequence may be generated starting from 0, namely 0, 2, 4,..., K-2. Because the difference between successive terms in Equations (2) and (3) is a linear function and a constant, respectively, the circuit may be implemented using only add, subtract, and select operations, as described below in additional detail, for generating addresses of a sequence. Additionally, for purposes of pipelining multiple sequences, namely multiple threads or streams, where multiple streams are processed with one another, temporary storing operations, such as registering operations, may be added.
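  • As a worked check of the statement that successive differences reduce to a linear function and a constant, the finite differences of Equation (1) for a skip value n can be expanded directly. This is a derivation sketch from Equation (1) alone; whether its form and notation match the specification's Equations (2) and (3) is an assumption.

```latex
\begin{aligned}
\Pi(x+n) - \Pi(x)
  &\equiv f_1 (x+n) + f_2 (x+n)^2 - f_1 x - f_2 x^2 \\
  &\equiv f_1 n + f_2 n^2 + 2 f_2 n x \pmod{K}, \\[4pt]
\bigl[\Pi(x+2n) - \Pi(x+n)\bigr] - \bigl[\Pi(x+n) - \Pi(x)\bigr]
  &\equiv 2 f_2 n (x+n) - 2 f_2 n x \\
  &\equiv 2 f_2 n^2 \pmod{K}.
\end{aligned}
```

  • The first difference is linear in x and the second difference does not depend on x, so one accumulator that adds a constant and a second accumulator that adds the running difference are sufficient, each followed by a single conditional correction by K.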
  • multiple phases or sequences may be pipelined in a circuit implementation of an address generator to enhance throughput for generating interleaved addresses.
  • pipelining may be used to generate interleaved address sequences for different threads of a single or multiple blocks of data in an alternating manner.
  • sequence start points namely many different starting points for x, and/or skip values, n
  • Initialization values may be predetermined and stored in memory for initialization of address generation for a sequence.
  • FIG. 2 is a block diagram depicting an exemplary embodiment of an interleaver 200.
  • Interleaver 200 may be part of a decoder, an encoder, or a codec. More particularly, interleaver 200 may be associated with convolutional codes, such as turbo-channel codes for QPP interleaving.
  • Block size 201 may be input to storage 210, which may be part of or separate from interleaver 200. Storage 210 may be a look-up table, a random access memory, or other form of storage. Additionally, block size 201 may be input to address generator 220. Another input to storage 210 may be skip value 202. With block size 201 and skip value 202, initialization values 203 and step size 204 may be obtained from storage 210 for providing to address generator 220. Address generator 220 produces addresses 221 to provide one or more sequences of addresses.
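  • One way the contents of such storage could be precomputed is sketched below. The function names and the table layout are hypothetical. It is assumed that the first initialization value is the first difference of Equation (1) at the starting index, the second is the starting address itself, and the step size is the constant second difference; the source text does not spell out these formulas.

```python
def initialization_values(block_size, f1, f2, skip, start):
    """Return (init_i, init_a, step_size) for one address sequence.

    Assumed definitions, all reduced modulo K:
      init_a    = Pi(start)
      init_i    = Pi(start + skip) - Pi(start)
      step_size = second difference, constant for a given skip
    """
    k = block_size
    init_a = (f1 * start + f2 * start * start) % k
    init_i = (f1 * skip + f2 * skip * skip + 2 * f2 * skip * start) % k
    step_size = (2 * f2 * skip * skip) % k
    return init_i, init_a, step_size


def build_table(block_size, f1, f2, skip, starts):
    """Precompute a lookup table keyed by sequence start point."""
    return {
        start: initialization_values(block_size, f1, f2, skip, start)
        for start in starts
    }
```

  • For instance, `build_table(40, 3, 10, 1, [0, 20])` would hold the values for two threads covering the two halves of an assumed 40-entry block.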
  • FIG. 3 is a circuit diagram depicting an embodiment of address generator 220.
  • Address generator 220 includes a first stage address engine 310 and a second stage address engine 320.
  • First stage address engine 310 is an initial stage for address generation and generates a stage output 302.
  • Stage output 302 is provided to second stage address engine 320 for generating at least one sequence of addresses 221.
  • First stage address engine 310 includes adder 311, subtractor 312, and a select circuit, such as multiplexer 313.
  • first stage address engine 310 includes registers 314 and 315.
  • For a single stream/sequence, only one register, namely either register 314 or 315, may be implemented within the feedback loop of first stage address engine 310.
  • the setup of registers in first stage engine 310 mirrors that of second stage engine 320 to ensure that the values for a particular stream/sequence are coincident at the input to adder 321 from stage output 302 at the same point in time for iterations.
  • pipelining may be used to enhance throughput.
  • By having at least one each of registers 314 and 315, two sequences of addresses, namely two threads or streams, may be generated together. Furthermore, even though only one of each of registers 314 and 315 is illustratively shown, it should be appreciated that more than one of each of registers 314 and 315 may be implemented. For example, if there were two of each of registers 314 and 315, then as many as four threads or streams of sequences may be generated with pipelined concurrency. It should be understood that streams are generated on alternate clock cycles. Furthermore, edge triggered flip-flops may be used to generate streams on alternate edges. For purposes of clarity by way of example and not limitation, it shall be assumed that there is only one each of registers 314 and 315.
  • initialization values 203 may be obtained from storage 210. These initialization values are indicated as initialization value I(x) 203-1 and initialization value A(x) 203-2.
  • Second stage address engine 320 includes adder 321, adder 322, and select circuitry, such as multiplexer 323. Additionally, if pipelining is used, second stage address engine 320 may include at least one register 324 and at least one register 325. Again, there may be at least one of registers 324 and 325 or multiples of each of registers 324 and 325 as previously described with reference to registers 314 and 315. Again, however, for purposes of clarity by way of example and not limitation, it shall be assumed that there is one each of registers 324 and 325. At this point, it should be understood that address engines 310 and 320 may be implemented with three adders, one subtractor, and two select circuits.
  • Initialization value I(x) 203-1 is provided as a loadable input to loadable adder 311.
  • output of adder 311 uses initialization value I(x) 203-1 as its initial valid output for a sequence.
  • initialization value A(x) 203-2, which is provided as a loadable input to loadable adder 321, is used for an initial valid output therefrom.
  • a step size 204 is provided as a data input to adder 311.
  • Another data input to adder 311 is stage output 302, which is provided as a feedback input. Accordingly, step size 204 may be added with initial stage output 302 for output after an initialization value I(x) 203-1 is output from such adder. More particularly, for the exemplary embodiment of FIG. 3, because registers 314 and 315 are both within the feedback loop, a first initialization value applied at adder 311 is not fed back as feedback 302 to adder 311 until two clock cycles later when it should have step size 204 added to it (i.e., in the third cycle).
  • an additional initialization value may be applied for a second sequence/stream as supported when there are two registers/pipe-stages within the feedback loop.
  • Output of adder 311 is provided to a data input port of register 314.
  • Output of register 314 is provided to a plus port of subtractor 312.
  • a sign bit, such as a most significant bit (“MSB") 316 is obtained from the output of register 314 as a control select signal of multiplexer 313. It should be appreciated that the MSB output from register 314 is also provided to the plus port of subtractor 312.
  • a logic 0 port of multiplexer 313 is coupled to receive block size 201, and a logic 1 port of multiplexer 313 is coupled to receive logic 0s 330. If MSB bit 316 is a logic 1 indicating output of register 314 is a negative value, then multiplexer 313 outputs logic 0s 330, namely a null value. If, however, MSB bit 316 is a logic 0 indicating output of register 314 is a positive value, then multiplexer 313 outputs block size 201.
  • Output of multiplexer 313 is provided to a minus port of subtractor 312 for subtracting from the data input to a plus port thereof.
  • multiplexer 313 and subtractor 312 in combination may be considered a loadable adder, where the value to be loaded is the candidate to be subtracted from (i.e., connected to plus input port) and the load control bit is the MSB of this value. Accordingly, it should be appreciated that if output from register 314 is positive, subtraction of block size 201, namely -K, forces output of subtractor 312 to be negative, namely in a range of -K to -1.
  • output of subtractor 312 is in a range of -K to -1 for input to a data port of register 315.
  • Output of register 315 is stage output 302.
  • stage output 302 will be in a range of -K to -1, for K being block size 201.
  • first stage address engine 310 shifts the range to negative values, namely a move of -K.
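  • The sign-bit selection in the first stage can be modeled at the bit level as sketched below. This is an illustrative model only: the 16-bit width, the helper names, and the omission of registers 314 and 315 are assumptions, and the reference numerals in the comments simply map each line to the element it imitates.

```python
WIDTH = 16  # assumed register width in bits; must hold values from -K to K-1


def wrap(value, width=WIDTH):
    """Interpret the low `width` bits of value as a two's-complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def sign_bit(value, width=WIDTH):
    """Return the MSB of the two's-complement encoding of value."""
    return ((value & ((1 << width) - 1)) >> (width - 1)) & 1


def first_stage(stage_output, step_size, block_size, width=WIDTH):
    """One first-stage iteration: add the step, then conditionally subtract K."""
    total = wrap(stage_output + step_size, width)  # adder 311
    select = sign_bit(total, width)                # MSB 316
    subtrahend = 0 if select else block_size       # multiplexer 313
    return wrap(total - subtrahend, width)         # subtractor 312
```

  • Because the fed-back value lies in -K to -1 and the step size in 0 to K-1, the sum lies in -K to K-2, so a single conditional subtraction is enough to return it to the range -K to -1.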
  • Stage output 302 from first stage address engine 310 is provided to a data port of adder 321 for addition with an address 221.
  • Address 221 is an address output from register 325 and provided as a feedback address. It should be appreciated that a sequence of addresses 221 is produced from multiple clock cycles during operation. On clock cycles where valid data is output from address generator 220, address 221 constitutes an address output forming part of address sequence.
  • loadable adder 321 may output the sum of a feedback address 221 and a stage output 302. On a next cycle, another initialization value for another sequence may be applied, as previously described with reference to loadable adder 311 and not repeated here for purposes of clarity.
  • Output from loadable adder 321 is provided to a data port of register 324.
  • Output of register 324 is provided to a data port of adder 322, and a sign bit, such as an MSB bit 326, output from register 324 is provided as a control select signal to multiplexer 323 as well as being provided to a data port of adder 322.
  • a logic 0 port of multiplexer 323 is coupled to receive logic 0s 330, and a logic 1 port of multiplexer 323 is coupled to receive block size 201.
  • For MSB bit 326 being a logic 0, namely indicating that output of register 324 is positive, multiplexer 323 selects logic 0s 330 for output. If, however, MSB bit 326 is a logic 1 indicating that output of register 324 is a negative value, then multiplexer 323 selects block size 201 for output.
  • Output of multiplexer 323 is provided to a data input port of adder 322.
  • Adder 322 adds the output from register 324 with the output from multiplexer 323. Accordingly, it should be appreciated that output of adder 322 is in a positive range, namely from 0 to K-1. In other words, by adding K back in address engine 320, the shift or move of values by -K in address engine 310 is effectively neutralized, namely has no net effect on the calculation.
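  • The claim that the -K shift has no net effect can be checked exhaustively for a small block size. The sketch below is a hypothetical behavioral model of the second stage without its registers; the function names are illustrative, and the reference numerals in the comments only indicate which element each line imitates.

```python
def second_stage(address, stage_output, block_size):
    """One second-stage iteration: add the stage output, then conditionally add K."""
    total = address + stage_output               # adder 321
    addend = block_size if total < 0 else 0      # multiplexer 323, selected by MSB 326
    return total + addend                        # adder 322


def offset_cancels(block_size):
    """Check that a stage output shifted by -K gives the same result as mod-K addition."""
    k = block_size
    for address in range(k):
        for increment in range(k):
            shifted = increment - k  # value in -K .. -1, as the first stage supplies
            if second_stage(address, shifted, k) != (address + increment) % k:
                return False
    return True
```

  • Supplying the increment already shifted by -K means the second stage only ever needs a conditional addition, never a comparison against K.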
  • Output of register 325 is an address 221 , which is fed back to adder 321 and which is used as part of an address sequence.
  • First stage address engine 310 and second stage address engine 320 may be implemented with respective DSPs 106 and CLBs 102 of FPGA 100 of FIG. 1. Alternatively, only CLBs 102 may be used for implementing engines 310 and 320.
  • By having an address engine stage implemented with one each of a CLB and a DSP, implementing multiple address engines to operate in parallel is facilitated, as so few resources are consumed by each address engine stage. In other words, because so few circuit components may be used to provide address generator 220, there are more opportunities for implementing multiple address generators within an FPGA.
  • address engines 310 and 320 are coupled in series and thus have a sequential operation. However, it should be understood that address engines 310 and 320 are operated concurrently for processing a sequence. Thus, for the exemplary embodiment having registers 314, 315, 324, and 325, rather than having a four cycle latency before a valid address 221 is output as part of an address sequence 321, there is only a two cycle latency. This is described in additional detail with reference to FIG. 4, where there is shown a flow diagram depicting an exemplary embodiment of an address generation flow 400 of address generator 220 of FIG. 3. Flow 400 is further described with simultaneous reference to FIGS. 3 and 4.
  • block and skip sizes such as block size 201 and skip size 202
  • initialization sizes such as initialization values I(x) 203-1 and A(x) 203-2
  • a step size such as step size 204
  • a sum is generated, such as by adder 311, as previously described.
  • a sum is generated by adder 321, as previously described. It should be appreciated that sums generated at 403 and 404 are generated concurrently, namely in parallel.
  • the sum generated at 403 is used in generating a difference, such as by subtractor 312. Again, this difference is in a range of -K to -1.
  • the difference generated at 405 is provided for generating another sum at 404 on a next cycle.
  • a sum is generated, such as by adder 322, using the sum generated at 404. Again, generating of a difference at 405 and generating of a sum at 406 was previously described with reference to FIG. 3, and is not repeated here for purposes of clarity. Again, the range of the sum generated at 406 is from 0 to K-1. Furthermore, an address may be output at 406, such as address 221.
  • the address output at 406 is fed back to generate another sum at 404, in case the sequence is not completed. Moreover, the difference generated at 405 is fed back to generate another sum at 403, in case the sequence is not completed.
  • a counter (not shown) coupled to receive clock signal 301 may be preset for a linear sequence responsive to a step size 204 and/or a block size 201. However, for an implementation in software, including firmware, a decision may be made. If the sequence is to be incremented, then at 408 the sequence is incremented, namely x, or i as described below, is incremented, for generating other sums at 403 and 404 on a next clock cycle. Accordingly, the sequence of operations may be in hardware, software, or a combination thereof.
  • FIG. 5 is a pseudo-code listing depicting an exemplary embodiment of an address generation flow 500. Values are set and initialized as generally indicated at 501 for loop 502.
  • block size K is equal to 256 for a turbo code and that skip value n is equal to two, namely two phases or two sequences being processed simultaneously, for setting block and skip sizes at 503.
  • the sequences are an odd sequence and an even sequence.
  • x starts at 0, and for the odd sequence x starts at 1.
  • initialization value (“A_cand[x]”) 203-2(even) is Equation (1) with x equal to 0.
  • initialization value (“I_cand[x]”) 203-1(even) is Equation (2) with x set equal to 0. It should be appreciated that both initialization values 203-1 and 203-2 for an even sequence reduce to respective constants, as coefficients $f_1$ and $f_2$ are constants.
  • Step size 204 is not dependent on x as indicated in Equation (3), and thus step size ("s") 204 is a constant value.
  • initialization address candidate (“A_cand[x]”) and increment candidate (“I_cand[x]”) progress for each increase in x.
  • x is of the sequence 0, 2, 4,...,K-2
  • x has a progression of 1, 3, 5,...,K-1, for this exemplary embodiment.
  • An address candidate is positive on a first iteration for a sequence, so it may be output directly. Furthermore, an increment candidate is positive on a first iteration for a sequence, so has a block size subtracted therefrom.
  • the first address value output for the even sequence is initialization value 203-2(even), namely 0, and the initial stage output for such first iteration is initialization value 203-1(even) minus K.
  • first iteration it should be understood that there may be some cycle latency as previously described, and thus the first iteration means the first valid output.
  • step size 204 is a constant which may be initialized as it depends only on skip value n for both odd and even phases. In other words, both odd and even phases have the same step sizes. It is not necessary that skip value be set for n equal to 2.
  • skip value n may be set equal to 1.
  • a block size of K equal to 256 is described for purposes of clarity by way of example and not limitation, it should be understood that block sizes greater than or less than 256 may be used.
  • a fixed block size is used for this example for purposes of clarity, it should be appreciated that a variable block size may be used. Thus, it is not necessary to use an odd and even sequence or even to alternate among multiple sequences using skip value.
  • skip value may be set to some fraction of the block size. It is not necessary for the linear sequence to progress all the way from 0 through to K-1, but some fraction of a sequence may be processed. However, for purposes of clarity by way of example and not limitation, it shall be assumed that the entire sequence from 0 to K-1 is processed in loop 502.
  • x may be reinitialized at a fraction of the block size.
  • an increment i is set as going from 0 to K-1 for loop 502. If the address candidate is negative, then the block size K is added to the address candidate as indicated at 512. If the increment candidate is positive, then block size K is subtracted as indicated at 513. At 514, the next address candidate for a then current phase is calculated.
  • loop 502 in this example is for i from 0 to K-1 in increments of one, and when i is equal to K-1 after 516, then loop 502 ends at 517.
  • address generation flow 500 has been described for multiple threads or sequences, it should be understood that such flow may be reduced down for a single sequence, in which case only one set of address and increment candidates would be obtained. Furthermore, it should be understood that more than two sets of address and increment candidates may be incremented for more than two threads or phases.
  • initialization may take place before any register in each engine whereas the above description assumes initialization using the logic located in front of or just before an initial register of each engine.
  • the exemplary embodiments just happen to show initialization in loadable adders 311 and 321 before registers 314 and 324, respectively, of FIG. 3. Initialization was assumed to be in adders 311 and 321 because these adders are less complex as they do not involve respective multiplexers.
  • initialization may take place at a loadable subtractor 312 and a loadable adder 322. Or both streams may be initialized at once rather than sequentially. So the difference from subtractor 312 and the sum from adder 322 may be initialized for a first sequence at the same time as the sum from adder 311 and the sum from adder 321 are initialized for a second sequence. Also when extra registers are inserted to allow for one or more extra streams, there may be no logic in front of such registers, and thus such registers may be used for initialization.
  • first and second streams/sequences may be completely independent of one another and each may be started at any point in a block though both may not have a same starting point.
  • the first stream/sequence does not necessarily have to be initialized before or after the second stream/sequence.
  • the third initialization value corresponds to the same stream/sequence as the first initialization value, and where the third initialization value initializes the second processing engine, the first initialization value may be used to initialize the first processing engine for the same stream/sequence with a specific start location between 0 and K-1.
  • the second initialization value and the fourth initialization value may correspond to the same stream/sequence.
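The loop of flow 500 described in the items above can be sketched in Python as follows. This is a minimal illustration and not the FIG. 5 listing itself: the function name, the list-based candidates, and the coefficients f1 = 15 and f2 = 32 are assumptions made for this sketch, and the comparison at 513 treats zero as positive so that the increment candidate always lands in the range -K to -1.

```python
def flow_500(K, f1, f2, n=2):
    """Sketch of address generation flow 500 for n interleaved phases.

    Phase p covers x = p, p + n, p + 2n, ...; phases take turns, one per
    loop iteration, so iteration i emits the address for x = i.
    """
    # Initialization: step size s (Equation (3)) plus one address
    # candidate (Equation (1)) and one increment candidate (Equation (2))
    # per phase.
    s = (2 * n * n * f2) % K
    a_cand = [(f1 * x + f2 * x * x) % K for x in range(n)]
    i_cand = [(f2 * (2 * n * x + n * n) + f1 * n) % K for x in range(n)]

    addresses = []
    for i in range(K):                 # loop 502
        p = i % n                      # phase processed on this iteration
        if a_cand[p] < 0:              # 512: bring address into 0..K-1
            a_cand[p] += K
        if i_cand[p] >= 0:             # 513: bring increment into -K..-1
            i_cand[p] -= K
        addresses.append(a_cand[p])
        a_cand[p] += i_cand[p]         # 514: next address candidate
        i_cand[p] += s                 # next increment candidate
    return addresses


# Block size 256 and skip value 2 follow the example in the text; the
# coefficients f1 = 15 and f2 = 32 are assumed here for illustration.
K, f1, f2 = 256, 15, 32
generated = flow_500(K, f1, f2, n=2)
expected = [(f1 * x + f2 * x * x) % K for x in range(K)]
matches_equation_1 = generated == expected
```

Inside the loop only additions, subtractions, and sign tests are used; the modulo operations appear only in the initialization, which the description indicates may be precomputed.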

Abstract

Address generation by an integrated circuit (100) is described. An aspect relates generally to an address generator (220) which has first and second processing units (310, 320). The second processing unit (320) is coupled to receive a stage output from the first processing unit (310) and configured to provide an address output. The stage output is in a first range, and the address output is in a second range. The first range is from -K to -1 for a block size of K, and the second range is from 0 to K-1.

Description

ADDRESS GENERATION
FIELD OF THE INVENTION
The invention relates to integrated circuit devices ("ICs"). More particularly, the invention relates to address generation by an IC.
BACKGROUND OF THE INVENTION
Programmable logic devices ("PLDs") are a well-known type of integrated circuit that can be programmed to perform specified logic functions. One type of PLD, the field programmable gate array ("FPGA"), typically includes an array of programmable tiles. These programmable tiles can include, for example, input/output blocks ("IOBs"), configurable logic blocks ("CLBs"), dedicated random access memory blocks ("BRAMs"), multipliers, digital signal processing blocks ("DSPs"), processors, clock managers, delay lock loops ("DLLs"), and so forth. As used herein, "include" and "including" mean including without limitation.
Each programmable tile typically includes both programmable interconnect and programmable logic. The programmable interconnect typically includes a large number of interconnect lines of varying lengths interconnected by programmable interconnect points ("PIPs"). The programmable logic implements the logic of a user design using programmable elements that can include, for example, function generators, registers, arithmetic logic, and so forth.
The programmable interconnect and programmable logic are typically programmed by loading a stream of configuration data into internal configuration memory cells that define how the programmable elements are configured. The configuration data can be read from memory (e.g., from an external PROM) or written into the FPGA by an external device. The collective states of the individual memory cells then determine the function of the FPGA.
Another type of PLD is the Complex Programmable Logic Device, or CPLD. A CPLD includes two or more "function blocks" connected together and to input/output ("I/O") resources by an interconnect switch matrix. Each function block of the CPLD includes a two-level AND/OR structure similar to those used in Programmable Logic Arrays ("PLAs") and Programmable Array Logic ("PAL") devices. In CPLDs, configuration data is typically stored on-chip in non-volatile memory. In some CPLDs, configuration data is stored on-chip in non-volatile memory, then downloaded to volatile memory as part of an initial configuration (programming) sequence.
For all of these programmable logic devices ("PLDs"), the functionality of the device is controlled by data bits provided to the device for that purpose. The data bits can be stored in volatile memory (e.g., static memory cells, as in FPGAs and some CPLDs), in non-volatile memory (e.g., FLASH memory, as in some CPLDs), or in any other type of memory cell.
Other PLDs are programmed by applying a processing layer, such as a metal layer, that programmably interconnects the various elements on the device. These PLDs are known as mask programmable devices. PLDs can also be implemented in other ways, e.g., using fuse or antifuse technology. The terms "PLD" and "programmable logic device" include but are not limited to these exemplary devices, as well as encompassing devices that are only partially programmable. For example, one type of PLD includes a combination of hard-coded transistor logic and a programmable switch fabric that programmably interconnects the hard-coded transistor logic.
Turbo-channel codes conventionally are used to code data. Turbo codes use data in the order in which it is received and in an interleaved order. Original data is therefore used twice. By turbo-channel codes, it is meant convolutional codes. The data is shuffled using an interleaver, and such interleaver may be part of an encoder, a decoder, or an encoder/decoder ("codec").
Data may be interleaved prior to encoding and then deinterleaved for decoding. Some coding systems, including either or both encoding and decoding systems, achieve high throughput through parallel processing. Data is generally interleaved by an encoder and deinterleaved by a decoder. Because decoding is more computationally intensive than encoding, and in order to achieve overall system high throughput, deinterleaving should be capable of being implemented in parallel in the decoder.
In the 3rd Generation Partnership Project ("3GPP"), a quadratic permutation polynomial ("QPP") interleaver is called out in the proposed Long Term Evolution ("LTE") 3GPP specification to facilitate contention-free addressing. Additional details regarding 3GPP LTE may be found at http://www.3gpp.org. In particular, the 3GPP TS 36.212 version 8.3.0 Technical Specification dated May 2008 discloses channel coding, multiplexing, and interleaving in section 5 thereof, particularly sub-sections 5.1.3, 5.1.4.1.1, and 5.2.2.8 describing a channel interleaver.
Using a QPP interleaver allows individual blocks of data to be split into multiple threads and processed in parallel. If multiple independent blocks of data each have their threads processed, then processing such threads of all such data blocks in parallel involves replicating the QPP interleaver. Accordingly, it should be appreciated that the size and performance of an interleaver circuit used to implement a QPP interleaver affects the efficiency of both encoding and decoding turbo-channel codes.
SUMMARY OF THE INVENTION
An embodiment of an address generator comprises a first processing unit, and a second processing unit coupled to receive a stage output from the first processing unit and configured to provide an address output. The stage output is in a first range from -K to -1 for a block size of K, and the address output is in a second range from 0 to K-1.
In this embodiment, the address generator can be part of a coding device selected from a group consisting of an encoder, a decoder, and a codec, where the address generator provides the address output for quadratic permutation polynomial interleaving. The address output can include multiple address sequences. The first processing unit and the second processing unit respectively can be initialized with a first initialization value or a second initialization value. The first initialization value can be for a first sequence of the multiple address sequences, and the second initialization value can be for a second sequence of the multiple address sequences. The address output can be for at least part of an address sequence from 0 to K-1; the first processing unit can be initialized with a first initialization value and a second initialization value; and the second processing unit can be initialized with a third initialization value and a fourth initialization value. In this embodiment, the first processing unit can comprise a first adder; a first register, coupled to the first adder; a first multiplexer, coupled to the first register; a first subtractor, coupled to the first multiplexer and the first register; and a second register, coupled to the subtractor, to output the stage output, where the stage output is fed back to the first adder. The first register can process a first sequence and the second register can simultaneously process a second sequence. The second processing unit can comprise: a second adder to receive the stage output; a third register, coupled to the second adder; a second multiplexer, coupled to the third register; a third adder, coupled to the second multiplexer and the third register; and a fourth register, coupled to the third adder, to output the address output, where the address output can be fed back to an input of the second adder.
An embodiment of a method to generate addresses comprises: obtaining a step size and a block size; obtaining a first initialization value and a second initialization value; adding the step size to a difference to provide a first sum; subtracting either a null value or the block size from the first sum responsive to a sign bit of the first sum to provide another difference, where the other difference is in a range of -K to -1 for block size of K; registering the first sum or the other difference; and feeding back the other difference in order to add the other difference to the step size.
In this embodiment, the method can further comprise: generating a second sum by adding the other difference to a third sum; adding either the null value or the block size to the second sum in response to a sign bit of the second sum to provide another third sum, where the other third sum is in a range of 0 to K-1; registering the second sum or the other third sum; and feeding back the other third sum for another iteration of the step for adding to provide the second sum. The registering the first sum or the other difference can include registering the other difference within respective feedback loops for pipelined operation, and where registering the second sum or the other third sum can include registering the other third sum within respective feedback loops for pipelined operation. The registering the first sum or the other difference can include registering the first sum within respective feedback loops for pipelined operation, and where the registering the second sum or the other third sum can include registering the second sum within respective feedback loops for pipelined operation. The step of adding the step size to the difference to provide the first sum can be performed simultaneously with the step of adding to provide the second sum by addition of the other difference to the third sum. The method can further comprise providing the other third sum for quadratic permutation polynomial interleaving.
BRIEF DESCRIPTION OF THE DRAWINGS
Accompanying drawing(s) show exemplary embodiment(s) in accordance with one or more aspects of the invention; however, the accompanying drawing(s) should not be taken to limit the invention to the embodiment(s) shown, but are for explanation and understanding only.
FIG. 1 is a simplified block diagram depicting an exemplary embodiment of a columnar Field Programmable Gate Array ("FPGA") architecture in which one or more aspects of the invention may be implemented.
FIG. 2 is a block diagram depicting an exemplary embodiment of an interleaver.
FIG. 3 is a circuit diagram depicting an exemplary embodiment of an address generator of the interleaver of FIG. 2.
FIG. 4 is a flow diagram depicting an exemplary embodiment of an address generation flow of the address generator of FIG. 3.
FIG. 5 is a pseudo-code listing depicting an exemplary embodiment of an address generation flow.
DETAILED DESCRIPTION
In the following description, numerous specific details are set forth to provide a more thorough description of the specific embodiments of the invention. It should be apparent, however, to one skilled in the art, that the invention may be practiced without all the specific details given below. In other instances, well known features have not been described in detail so as not to obscure the invention. For ease of illustration, the same number labels are used in different diagrams to refer to the same items; however, in alternative embodiments the items may be different.
As noted above, advanced FPGAs can include several different types of programmable logic blocks in the array. For example, FIG. 1 illustrates an FPGA architecture 100 that includes a large number of different programmable tiles including multi-gigabit transceivers ("MGTs") 101, configurable logic blocks ("CLBs") 102, random access memory blocks ("BRAMs") 103, input/output blocks ("IOBs") 104, configuration and clocking logic ("CONFIG/CLOCKS") 105, digital signal processing blocks ("DSPs") 106, specialized input/output blocks ("I/O") 107 (e.g., configuration ports and clock ports), and other programmable logic 108 such as digital clock managers, analog-to-digital converters, system monitoring logic, and so forth. Some FPGAs also include dedicated processor blocks ("PROC") 110.
In some FPGAs, each programmable tile includes a programmable interconnect element ("INT") 111 having standardized connections to and from a corresponding interconnect element in each adjacent tile. Therefore, the programmable interconnect elements taken together implement the programmable interconnect structure for the illustrated FPGA. The programmable interconnect element 111 also includes the connections to and from the programmable logic element within the same tile, as shown by the examples included at the top of FIG. 1.
For example, a CLB 102 can include a configurable logic element ("CLE") 112 that can be programmed to implement user logic plus a single programmable interconnect element ("INT") 111. A BRAM 103 can include a BRAM logic element ("BRL") 113 in addition to one or more programmable interconnect elements. Typically, the number of interconnect elements included in a tile depends on the height of the tile. In the pictured embodiment, a BRAM tile has the same height as five CLBs, but other numbers (e.g., four) can also be used. A DSP tile 106 can include a DSP logic element ("DSPL") 114 in addition to an appropriate number of programmable interconnect elements. An IOB 104 can include, for example, two instances of an input/output logic element ("IOL") 115 in addition to one instance of the programmable interconnect element 111. As will be clear to those of skill in the art, the actual I/O pads connected, for example, to the I/O logic element 115 typically are not confined to the area of the input/output logic element 115.
In the pictured embodiment, a columnar area near the center of the die (shown in FIG. 1) is used for configuration, clock, and other control logic. Horizontal areas 109 extending from this column are used to distribute the clocks and configuration signals across the breadth of the FPGA.
Some FPGAs utilizing the architecture illustrated in FIG. 1 include additional logic blocks that disrupt the regular columnar structure making up a large part of the FPGA. The additional logic blocks can be programmable blocks and/or dedicated logic. For example, processor block 110 spans several columns of CLBs and BRAMs.
Note that FIG. 1 is intended to illustrate only an exemplary FPGA architecture. For example, the numbers of logic blocks in a column, the relative width of the columns, the number and order of columns, the types of logic blocks included in the columns, the relative sizes of the logic blocks, and the interconnect/logic implementations included at the top of FIG. 1 are purely exemplary. For example, in an actual FPGA more than one adjacent column of CLBs is typically included wherever the CLBs appear, to facilitate the efficient implementation of user logic, but the number of adjacent CLB columns varies with the overall size of the FPGA.
As previously described, a QPP interleaver is specified in an LTE 3GPP specification, and such QPP interleaver may be formulated as a quadratic equation modulo the block size, K. A direct implementation of the specified QPP interleaving process would involve complex multiplication and complex modulo operations, which are extremely inefficient for implementation in hardware. A more efficient hardware implementation is described in co-pending U.S. Patent Application entitled "Address Generation for Quadratic Permutation Polynomial Interleaving" by Ben J. Jones et al, assigned application number 12/059,731, filed March 31, 2008 (Attorney Docket No. X-2726 US) [hereinafter "Jones"]. Jones shows and describes how the quadratic formula may be reduced to produce a circuit which may be implemented using adders, subtractors, and selection circuits, such as multiplexers. As described below in additional detail, an even further simplified circuit for address generation for interleaving may be obtained by removing selection operations associated with Jones and reducing the number of adders and subtractors of Jones. Furthermore, such reduction of circuitry in turn reduces register count in comparison to Jones, but as shall be appreciated from the following description such simplified address generator has the same or comparable performance to that of Jones. Another reduction in comparison to Jones is elimination of registers between first and second stages allowing control logic to be further simplified as initialization values may be applied simultaneously as described below in additional detail.
Even though the following description is in terms of an LTE 3GPP QPP interleaver and address sequence therefor, it should be appreciated that other address sequences may be used. An LTE 3GPP QPP interleaver has an address sequence as defined by:
$$\pi(x) = (f_1 x + f_2 x^2) \bmod K, \quad \text{where } 0 \le x, f_1, f_2 < K, \qquad (1)$$
where $f_1$ and $f_2$ are coefficients of the polynomial, x is an increment in a linear sequence from 0 to K-1, and K is block size. An x-th interleaved address may be obtained by using Equation (1), where $f_1$ and $f_2$ are fixed coefficients for any integer block size, K. Accordingly, the sequence of addresses for increments of x is from 0 to K-1 in a permutated order for x. Even though a sequence is described as going from 0 to K-1, it should be appreciated that a sequence need not start at 0 and need not go all the way to K-1, namely it need not step through each linear increment of the sequence for all K increments. Furthermore, there may be a skip value for skipping linear increments for generating a sequence. Again, it should be appreciated that a block of data may be broken out into multiple threads or streams for processing in parallel as described below in additional detail.
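For reference, Equation (1) can be evaluated directly, which is the multiply-and-modulo form that is costly in hardware. The sketch below is only an illustration: the function name and the parameter values K = 40, f1 = 3, f2 = 10 are assumptions chosen for this example and are not taken from this description.

```python
def qpp_address(x, f1, f2, K):
    """Directly evaluate Equation (1): (f1*x + f2*x^2) mod K."""
    return (f1 * x + f2 * x * x) % K


# Illustrative parameters (assumed for this sketch, not from the source).
K, f1, f2 = 40, 3, 10
addresses = [qpp_address(x, f1, f2, K) for x in range(K)]

# For a suitable coefficient pair, the sequence is a permutation of 0..K-1.
is_permutation = sorted(addresses) == list(range(K))
```

Each address here costs two multiplications and a modulo reduction, which is the work the add, subtract, and select formulation is intended to avoid.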
As indicated in Jones, a first derivation of Equation (1) is:
$$\pi'(x) = [f_2 (2nx + n^2) + f_1 n] \bmod K, \qquad (2)$$
and a second derivation of Equation (1) is:
$$\pi''(x) = [2 n^2 f_2] \bmod K. \qquad (3)$$
In Equations (2) and (3), n is a skip value which may be any integer value greater than 0. Thus, for example, if n is equal to 1, there is no skipping and each linear increment of a sequence, 0, 1, 2,..., to some number which may be as large as K-1, is processed in order to provide at most K interleaved addresses for such sequence. Thus, the skip value, n, may be used to determine the stride or jump in an interleaved address sequence generated.
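Equations (2) and (3) can be read as first and second differences of Equation (1) taken over a stride of n. The following worked derivation is supplied for illustration, under the assumption that the "derivations" are such differences reduced modulo K; it is not reproduced from Jones.

```latex
\begin{aligned}
\pi(x+n) - \pi(x) &= f_1 (x+n) + f_2 (x+n)^2 - f_1 x - f_2 x^2 \\
                  &= f_2\,(2nx + n^2) + f_1 n \;\equiv\; \pi'(x) \pmod{K}, \\
\pi'(x+n) - \pi'(x) &= f_2\,\bigl(2n(x+n) + n^2\bigr) - f_2\,(2nx + n^2) \\
                    &= 2 n^2 f_2 \;\equiv\; \pi''(x) \pmod{K}.
\end{aligned}
```

It follows that $\pi(x+n) = [\pi(x) + \pi'(x)] \bmod K$ and $\pi'(x+n) = [\pi'(x) + \pi''(x)] \bmod K$, so each new address needs only an addition and a conditional correction by K.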
Again, when n is set to 1 , a complete sequence of K addresses may be generated; however, if n is set to an integer value larger than 1 then a subset of addresses of a sequence may be generated. For example, if n is set equal to 2, then every other address in a sequence may be generated starting from 0, namely 0, 2, 4,..., K-2. Because the difference between successive terms in Equations (2) and (3) is a linear function and a constant, respectively, the circuit may be implemented using only add, subtract, and select operations, as described below in additional detail, for generating addresses of a sequence. Additionally, for purposes of pipelining multiple sequences, namely multiple threads or streams, where multiple streams are processed with one another, temporary storing operations, such as registering operations, may be added. Thus, as should be appreciated from the following description, multiple phases or sequences may be pipelined in a circuit implementation of an address generator to enhance throughput for generating interleaved addresses. Alternatively, depending on the parallel nature of turbo-code processing blocks, pipelining may be used to generate interleaved address sequences for different threads of a single or multiple blocks of data in an alternating manner. Thus it should be appreciated that many different sequence start points, namely many different starting points for x, and/or skip values, n, may be supported for a variety of data blocks. Initialization values may be predetermined and stored in memory for initialization of address generation for a sequence.
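A software sketch of this add, subtract, and select recurrence is given below for a single sequence with an arbitrary start point and skip value. The function names, the `start` and `count` parameters, and the example values K = 40, f1 = 3, f2 = 10 are assumptions made for illustration; the registers and pipelining of the circuit are not modeled.

```python
def qpp_direct(x, f1, f2, K):
    """Reference value from Equation (1)."""
    return (f1 * x + f2 * x * x) % K


def qpp_sequence(f1, f2, K, start=0, n=1, count=None):
    """Generate pi(start), pi(start + n), ... with add/subtract/select only
    inside the loop; modulo is used only to form the initialization values."""
    if count is None:
        count = (K - start + n - 1) // n   # number of x values below K
    # Initialization values, which may be precomputed and stored.
    address = qpp_direct(start, f1, f2, K)                        # Eq. (1)
    increment = (f2 * (2 * n * start + n * n) + f1 * n) % K - K   # Eq. (2), shifted to -K..-1
    step = (2 * n * n * f2) % K                                   # Eq. (3)

    out = []
    for _ in range(count):
        out.append(address)
        # Second stage: add the negative increment, add K back if negative.
        total = address + increment
        address = total + K if total < 0 else total
        # First stage: add the step, subtract K if the sum is non-negative.
        total = increment + step
        increment = total - K if total >= 0 else total
    return out


# Illustrative parameters (assumed for this sketch, not from the source).
K, f1, f2 = 40, 3, 10
full = qpp_sequence(f1, f2, K)                  # x = 0, 1, ..., K-1
even = qpp_sequence(f1, f2, K, start=0, n=2)    # x = 0, 2, ..., K-2
```

Keeping the increment in the range -K to -1 means each stage needs a single sign test and one conditional add or subtract of K, rather than a general modulo reduction.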
FIG. 2 is a block diagram depicting an exemplary embodiment of an interleaver 200. Interleaver 200 may be part of a decoder, an encoder, or a codec. More particularly, interleaver 200 may be associated with convolutional codes, such as turbo-channel codes for QPP interleaving. Block size 201 may be input to storage 210, which may be part of or separate from interleaver 200. Storage 210 may be a look-up table, a random access memory, or other form of storage. Additionally, block size 201 may be input to address generator 220. Another input to storage 210 may be skip value 202. With block size 201 and skip value 202, initialization values 203 and step size 204 may be obtained from storage 210 for providing to address generator 220. Address generator 220 produces addresses 221 to provide one or more sequences of addresses.
FIG. 3 is a circuit diagram depicting an embodiment of address generator 220 of FIG. 2. Address generator 220 includes a first stage address engine 310 and a second stage address engine 320. First stage address engine 310 is an initial stage for address generation and generates a stage output 302. Stage output 302 is provided to second stage address engine 320 for generating at least one sequence of addresses 221.
First stage address engine 310 includes adder 311, subtractor 312, and a select circuit, such as multiplexer 313. For this exemplary embodiment, first stage address engine 310 includes registers 314 and 315. For a single stream/sequence, only one register, namely either register 314 or 315, may be implemented within the feedback loop of first stage address engine 310. The setup of registers in first stage engine 310 mirrors that of second stage engine 320 to ensure that the values for a particular stream/sequence are coincident at the input to adder 321 from stage output 302 at the same point in time for iterations. However, pipelining may be used to enhance throughput. Additionally, by having at least one each of registers 314 and 315, two sequences of addresses, namely two threads or streams, may be generated together. Furthermore, even though only one of each of registers 314 and 315 is illustratively shown, it should be appreciated that more than one of each of registers 314 and 315 may be implemented. For example, if there were two of each of registers 314 and 315, then as many as four threads or streams of sequences may be generated with pipelined concurrency. It should be understood that streams are generated on alternate clock cycles. Furthermore, edge triggered flip-flops may be used to generate streams on alternate edges. For purposes of clarity by way of example and not limitation, it shall be assumed that there is only one each of registers 314 and 315.
As previously described, initialization values 203 may be obtained from storage 210. These initialization values are indicated as initialization value I(x) 203-1 and initialization value A(x) 203-2.
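One way to picture storage 210 is as a look-up table keyed by block size and skip value that returns the step size and the initialization values. The sketch below is an assumption-laden illustration: the table layout, the names `InitEntry`, `build_entry`, `COEFFS`, and `STORAGE`, the choice of one entry per phase starting at x = 0 to n-1, and the coefficient values are all hypothetical and are not specified by this description.

```python
from typing import Dict, List, NamedTuple, Tuple


class InitEntry(NamedTuple):
    step: int                     # step size 204, Equation (3)
    init: List[Tuple[int, int]]   # per phase: (A(x) 203-2, I(x) 203-1)


def build_entry(K: int, f1: int, f2: int, n: int) -> InitEntry:
    """Precompute values for n phases starting at x = 0, 1, ..., n-1
    (an assumed layout for this sketch)."""
    step = (2 * n * n * f2) % K
    init = []
    for x in range(n):
        a = (f1 * x + f2 * x * x) % K                  # Equation (1)
        i = (f2 * (2 * n * x + n * n) + f1 * n) % K    # Equation (2)
        init.append((a, i))
    return InitEntry(step, init)


# Hypothetical coefficient table: block size -> (f1, f2). Illustrative only.
COEFFS: Dict[int, Tuple[int, int]] = {40: (3, 10)}

# Storage 210 modeled as a dictionary keyed by (block size, skip value).
STORAGE: Dict[Tuple[int, int], InitEntry] = {
    (K, n): build_entry(K, f1, f2, n)
    for K, (f1, f2) in COEFFS.items()
    for n in (1, 2)
}

entry = STORAGE[(40, 1)]   # look up by block size 201 and skip value 202
```

Precomputing the table moves every multiplication and modulo operation out of the address generation path, leaving only the loadable adders to consume the stored values.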
Second stage address engine 320 includes adder 321 , adder 322, and select circuitry, such as multiplexer 323. Additionally, if pipelining is used, second stage address engine 320 may include at least one register 324 and at least one register 325. Again, there may be at least one of registers 324 and 325 or multiples of each of registers 324 and 325 as previously described with reference to registers 314 and 315. Again, however, for purposes of clarity by way of example and not limitation, it shall be assumed that there is one each of registers 324 and 325. At this point, it should be understood that address engines 310 and 320 may be implemented with three adders, one subtractor, and two select circuits.
Initialization value I(x) 203-1 is provided as a loadable input to loadable adder 311. On an initial clock cycle of clock signal 301, which is provided to a clock port of each of registers 314, 315, 324, and 325, adder 311 outputs initialization value I(x) 203-1 as its initial valid output for a sequence. Likewise, for an initial cycle of a sequence, initialization value A(x) 203-2, which is provided as a loadable input to loadable adder 321, is used for an initial valid output therefrom. A step size 204 is provided as a data input to adder 311. Another data input to adder 311 is stage output 302, which is provided as a feedback input. Accordingly, step size 204 may be added to stage output 302 for output after an initialization value I(x) 203-1 is output from such adder. More particularly, for the exemplary embodiment of FIG. 3, because registers 314 and 315 are present, a first initialization value applied at adder 311 is not fed back as feedback 302 to adder 311 until two clock cycles later, when it should have step value 204 added to it (i.e., in the third cycle). On the second cycle, an additional initialization value may be applied for a second sequence/stream, as supported when there are two registers/pipe-stages within the feedback loop.
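The two-register feedback timing described above can be sketched as a small cycle-level model. This is only an illustration under stated assumptions: the function names `wrap_negative` and `simulate_first_stage` are hypothetical, exactly two streams are loaded on the first two cycles, and `None` stands for a register that does not yet hold valid data.

```python
def wrap_negative(value, block_size):
    """Model of multiplexer 313 plus subtractor 312: result lies in [-K, -1]."""
    return value if value < 0 else value - block_size


def simulate_first_stage(block_size, step, load_a, load_b, cycles):
    """Cycle-level sketch of first stage engine 310 with registers 314 and 315.

    load_a is applied at adder 311 on the first cycle and load_b on the
    second cycle (two interleaved streams). Returns the value held by
    register 315 (stage output 302) after each clock edge.
    """
    r314 = None  # register 314, not yet valid
    r315 = None  # register 315, not yet valid
    outputs = []
    for cycle in range(cycles):
        # Loadable adder 311: load on the first two cycles, then add the
        # step size to the fed-back stage output.
        if cycle == 0:
            adder = load_a
        elif cycle == 1:
            adder = load_b
        else:
            adder = r315 + step
        # Register 315 captures the wrapped value of register 314.
        next_r315 = None if r314 is None else wrap_negative(r314, block_size)
        r314, r315 = adder, next_r315
        outputs.append(r315)
    return outputs
```

In this model the first load reaches the feedback input two clock edges after it is applied, so the step size is first added to it on the third cycle, and the two streams appear on alternate cycles of the stage output.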
Output of adder 311 is provided to a data input port of register 314. Output of register 314 is provided to a plus port of subtractor 312. Additionally, a sign bit, such as a most significant bit ("MSB") 316 is obtained from the output of register 314 as a control select signal of multiplexer 313. It should be appreciated that the MSB output from register 314 is also provided to the plus port of subtractor 312.
A logic 0 port of multiplexer 313 is coupled to receive block size 201, and a logic 1 port of multiplexer 313 is coupled to receive logic 0s 330. If MSB bit 316 is a logic 1 indicating a negative value, then multiplexer 313 outputs logic 0s 330, namely a null value. If, however, MSB bit 316 is a logic 0 indicating output of register 314 is a positive value, then multiplexer 313 outputs block size 201.
Output of multiplexer 313 is provided to a minus port of subtractor 312 for subtracting from the data input to a plus port thereof. Alternatively, multiplexer 313 and subtractor 312 in combination may be considered a loadable adder, where the value to be loaded is the candidate to be subtracted from (i.e., connected to the plus input port) and the load control bit is the MSB of this value. Accordingly, it should be appreciated that if output from register 314 is positive, subtraction of block size 201, namely K, forces output of subtractor 312 to be negative, namely in a range of -K to -1. If output of register 314 is already negative, subtracting logic 0s 330 from such output has no effect, and thus output of subtractor 312 is the negative output of register 314. Accordingly, output of subtractor 312 is in a range of -K to -1 for input to a data port of register 315. Output of register 315 is stage output 302. Thus, stage output 302 will be in a range of -K to -1, for K being block size 201. Thus, first stage address engine 310 shifts the range to negative values, namely a move of -K.
Stage output 302 from first stage address engine 310 is provided to a data port of adder 321 for addition with an address 221. Address 221 is an address output from register 325 and provided as a feedback address. It should be appreciated that a sequence of addresses 221 is produced from multiple clock cycles during operation. On clock cycles where valid data is output from address generator 220, address 221 constitutes an address output forming part of an address sequence.
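The sign-bit selection performed by multiplexer 313 and subtractor 312 can be illustrated at the bit level. The sketch below assumes two's-complement words of a hypothetical `width` chosen wide enough to hold values from -K to K-1. The names `stage_one_select` and `to_signed` are illustrative and do not come from the source.

```python
def to_signed(word, width):
    """Interpret an unsigned `width`-bit word as a two's-complement integer."""
    return word - (1 << width) if (word >> (width - 1)) & 1 else word


def stage_one_select(reg314_word, block_size, width):
    """Sketch of multiplexer 313 feeding the minus port of subtractor 312.

    The MSB of the register 314 word is the select signal: a set MSB
    (negative value) selects zeros, a clear MSB selects the block size.
    The result is returned as an unsigned `width`-bit word.
    """
    mask = (1 << width) - 1
    msb = (reg314_word >> (width - 1)) & 1
    subtrahend = 0 if msb else block_size
    return (reg314_word - subtrahend) & mask
```

For K equal to 256, a width of ten bits is sufficient in this model, and every input between -256 and 255 maps to a value between -256 and -1 that is congruent to the input modulo K.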
After outputting an initial initialization value A(x) 203-2, loadable adder 321 may output the sum of a feedback address 221 and a stage output 302. On a next cycle, another initialization value for another sequence may be applied, as previously described with reference to loadable adder 311 and not repeated here for purposes of clarity. Output from loadable adder 321 is provided to a data port of register 324. Output of register 324 is provided to a data port of adder 322, and a sign bit, such as an MSB bit 326, output from register 324 is provided as a control select signal to multiplexer 323 as well as being provided to a data port of adder 322.
A logic 0 port of multiplexer 323 is coupled to receive logic 0s 330, and a logic 1 port of multiplexer 323 is coupled to receive block size 201. For MSB bit 326 being a logic 0, namely indicating that output of register 324 is positive, multiplexer 323 selects logic 0s 330 for output. If, however, MSB bit 326 is a logic 1 indicating that output of register 324 is a negative value, then multiplexer 323 selects block size 201 for output.
Output of multiplexer 323 is provided to a data input port of adder 322. Adder 322 adds the output from register 324 with the output from multiplexer 323. Accordingly, it should be appreciated that output of adder 322 is in a positive range, namely from 0 to K-1. In other words, by adding K back in address engine 320, the shift or move of values by -K in address engine 310 is effectively neutralized, namely has no net effect on the calculation.
Output of adder 322, which is in a range of 0 to K-1, is provided to a data input port of register 325. Output of register 325 is an address 221, which is fed back to adder 321 and which is used as part of an address sequence.
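The interaction of the two engines can be sketched as a register-free behavioral model for a single stream. This is a minimal sketch, not the circuit itself: pipelining is ignored, and the names `wrap_positive`, `wrap_negative`, and `generate_addresses` are hypothetical.

```python
def wrap_negative(value, block_size):
    """First stage (mux 313 + subtractor 312): keep value in [-K, -1]."""
    return value if value < 0 else value - block_size


def wrap_positive(value, block_size):
    """Second stage (mux 323 + adder 322): keep value in [0, K-1]."""
    return value + block_size if value < 0 else value


def generate_addresses(block_size, a_init, i_init, step, count):
    """Behavioral sketch of address generator 220 for one stream.

    a_init and i_init play the roles of initialization values A(x) and
    I(x), both assumed to lie in [0, K-1]; step plays the role of step
    size 204, also assumed to lie in [0, K-1].
    """
    address = wrap_positive(a_init, block_size)   # address 221
    stage = wrap_negative(i_init, block_size)     # stage output 302
    addresses = []
    for _ in range(count):
        addresses.append(address)
        # Adder 321 sums the fed-back address and the stage output while
        # adder 311 sums the fed-back stage output and the step size.
        next_address = wrap_positive(address + stage, block_size)
        next_stage = wrap_negative(stage + step, block_size)
        address, stage = next_address, next_stage
    return addresses
```

Because the stage output is always in the range -K to -1 and the address is always in the range 0 to K-1, each sum falls between -K and K-2, so a single conditional add or subtract of K suffices and no general modulo operation is needed.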
First stage address engine 310 and second stage address engine 320 may be implemented with respective DSPs 106 and CLBs 102 of FPGA 100 of FIG. 1. Alternatively, only CLBs 102 may be used for implementing engines 310 and 320. Because an address engine stage may be implemented with one CLB and one DSP, operating multiple address engines in parallel is facilitated, as so few resources are consumed by each address engine stage. In other words, because so few circuit components may be used to provide address generator 220, there are more opportunities for implementing multiple address generators within an FPGA.
In the exemplary embodiment of FIG. 3, address engines 310 and 320 are coupled in series and thus have a sequential operation. However, it should be understood that address engines 310 and 320 are operated concurrently for processing a sequence. Thus, for the exemplary embodiment having registers 314, 315, 324, and 325, rather than having a four cycle latency before a valid address 221 is output as part of a sequence of addresses 221, there is only a two cycle latency. This is described in additional detail with reference to FIG. 4, where there is shown a flow diagram depicting an exemplary embodiment of an address generation flow 400 of address generator 220 of FIG. 3. Flow 400 is further described with simultaneous reference to FIGS. 3 and 4.
At 401, block and skip sizes, such as block size 201 and skip size 202, are obtained. At 402, initialization values, such as initialization values I(x) 203-1 and A(x) 203-2, and a step size, such as step size 204, are obtained from storage responsive to values obtained at 401. At 403, a sum is generated, such as by adder 311, as previously described. At 404, a sum is generated by adder 321, as previously described. It should be appreciated that sums generated at 403 and 404 are generated concurrently, namely in parallel.
At 405, the sum generated at 403 is used in generating a difference, such as by subtractor 312. Again, this difference is in a range of -K to -1. The difference generated at 405 is provided for generating another sum at 404 on a next cycle.
At 406, a sum is generated, such as by adder 322, using the sum generated at 404. Again, generating of a difference at 405 and generating of a sum at 406 were previously described with reference to FIG. 3, and are not repeated here for purposes of clarity. Again, the range of the sum generated at 406 is from 0 to K-1. Furthermore, an address may be output at 406, such as address 221.
The address output at 406 is fed back to generate another sum at 404, in case the sequence is not completed. Moreover, the difference generated at 405 is fed back to generate another sum at 403, in case the sequence is not completed.
From output at 406, it may be determined whether the sequence is to be incremented at 407. For a hardware implementation, a counter (not shown) coupled to receive clock signal 301 may be preset for a linear sequence responsive to a step size 204 and/or a block size 201. However, for an implementation in software, including firmware, a decision may be made. If the sequence is to be incremented, then at 408 the sequence is incremented, namely x, or i as described below, is incremented, for generating other sums at 403 and 404 on a next clock cycle. Accordingly, the sequence of operations may be in hardware, software, or a combination thereof.
If at 407, it is determined that the sequence is not to be incremented, then at 409, it may be determined whether there is another sequence to be processed. If at 409 it is determined that another sequence is to be processed, then flow 400 returns to 401 for obtaining block and skip sizes for such other sequence. If there is no additional sequence to be processed, then flow 400 ends at 499.
FIG. 5 is a pseudo-code listing depicting an exemplary embodiment of an address generation flow 500. Values are set and initialized as generally indicated at 501 for loop 502.
For FIG. 5, it is assumed that block size K is equal to 256 for a turbo code and that skip value n is equal to two, namely two phases or two sequences being processed simultaneously, for setting block and skip sizes at 503. For this exemplary embodiment, the sequences are an odd sequence and an even sequence. For the even sequence, x starts at 0, and for the odd sequence x starts at 1. Accordingly, for the even sequence, initialization value ("A_cand[x]") 203-2(even) is Equation (1) with x equal to 0. Furthermore, for the even sequence, initialization value ("I_cand[x]") 203-1(even) is Equation (2) with x set equal to 0. It should be appreciated that both initialization values 203-1 and 203-2 for an even sequence reduce to respective constants, as coefficients f1 and f2 are constants.
For an odd sequence, x starts at 1, and thus substituting x equal to 1 in Equation (1) yields an initialization value 203-2(odd), and substituting x equal to 1 in Equation (2) yields initialization value 203-1(odd). Likewise, it should be appreciated that initialization values 203-1 and 203-2 for an odd sequence each reduce to constants.
Step size 204 is not dependent on x as indicated in Equation (3), and thus step size ("s") 204 is a constant value. By constant values with respect to initialization values 203-1 and 203-2 for odd and even sequences, as well as step size 204, it should be understood that these are constants for one or more sequences of a data block. In this example, there are two threads or streams, but more than two threads may be implemented. As x is incremented as part of a linear sequence, initialization address candidate ("A_cand[x]") and increment candidate ("I_cand[x]") progress for each increase in x. Thus for a first phase, namely an even sequence in this example, x is of the sequence 0, 2, 4,...,K-2, and for a second phase, x has a progression of 1, 3, 5,...,K-1, for this exemplary embodiment.
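To make the constants concrete, the sketch below computes the initialization values and step size for a given starting point. Equations (1) through (3) are not reproduced in this section, so the forms used here are assumptions based on the usual quadratic permutation polynomial A(x) = (f1·x + f2·x²) mod K stepped by skip value n. The coefficient values f1 = 15 and f2 = 32 are illustrative only, and the function name `qpp_init_values` is hypothetical.

```python
def qpp_init_values(block_size, f1, f2, skip, x0):
    """Return (A(x0), I(x0), s) under the assumed QPP forms.

    Assumed forms (not quoted from the source):
      A(x) = (f1*x + f2*x*x) mod K                  address candidate
      I(x) = (f1*n + f2*n*n + 2*f2*n*x) mod K       increment candidate
      s    = (2*f2*n*n) mod K                       step size
    so that A(x + n) = A(x) + I(x) and I(x + n) = I(x) + s, modulo K.
    """
    a_init = (f1 * x0 + f2 * x0 * x0) % block_size
    i_init = (f1 * skip + f2 * skip * skip + 2 * f2 * skip * x0) % block_size
    step = (2 * f2 * skip * skip) % block_size
    return a_init, i_init, step


# Illustrative parameters: K = 256, n = 2, assumed coefficients f1 = 15, f2 = 32.
even_phase = qpp_init_values(256, 15, 32, 2, 0)
odd_phase = qpp_init_values(256, 15, 32, 2, 1)
```

Under these assumptions both phases share the same step size, since the assumed expression for s depends only on n, f2, and K, which agrees with the statement that the step size does not depend on x.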
An address candidate is positive on a first iteration for a sequence, so it may be output directly. Furthermore, an increment candidate is positive on a first iteration for a sequence, so it has the block size subtracted therefrom. Thus, for x equal to 0, the first address value output for the even sequence is initialization value 203-2(even), namely 0, and the initial stage output for such first iteration is initialization value 203-1(even) minus K. By first iteration, it should be understood that there may be some cycle latency as previously described, and thus the first iteration means the first valid output. For the second iteration, namely the second valid output but the first for the odd sequence, the address candidate is positive and thus it may be output directly, namely without addition of K, and the increment candidate is positive on the second iteration, so it has the block size subtracted from it. Thus, on a second iteration, initialization value 203-2(odd) is output as address 221 of FIG. 3, and initialization value 203-1(odd) minus K is output as stage output 302.
Again, step size 204 is a constant which may be initialized as it depends only on skip value n for both odd and even phases. In other words, both odd and even phases have the same step sizes. It is not necessary that the skip value be set for n equal to 2. In other words, larger skip values may be used or skip value n may be set equal to 1. Furthermore, even though a block size of K equal to 256 is described for purposes of clarity by way of example and not limitation, it should be understood that block sizes greater than or less than 256 may be used. Furthermore, even though a fixed block size is used for this example for purposes of clarity, it should be appreciated that a variable block size may be used. Thus, it is not necessary to use an odd and even sequence or even to alternate among multiple sequences using a skip value. For example, the skip value may be set to some fraction of the block size.
It is not necessary for the linear sequence to progress all the way from 0 through to K-1, but some fraction of a sequence may be processed. However, for purposes of clarity by way of example and not limitation, it shall be assumed that the entire sequence from 0 to K-1 is processed in loop 502.
It is not necessary that x have initialization values corresponding to skip value. For example, x may be reinitialized at a fraction of the block size.
Continuing the above example for K equal to 256, if x was to be initialized again at one half of K, then x equal to 128 would be substituted into Equations (1) and (2) for generating initialization values 203-2 and 203-1, respectively, for such processing. However, the first value, namely x equal to 0 in this sequence, would be as previously described.
At 511, an increment i is set as going from 0 to K-1 for loop 502. If the address candidate is negative, then the block size K is added to the address candidate as indicated at 512. If the increment candidate is positive, then block size K is subtracted as indicated at 513. At 514, the next address candidate for a then current phase is calculated. At 515, the next increment candidate for a then current phase is calculated. At 516, an address for the current phase is output. Loop 502 in this example is for i from 0 to K-1 in increments of one, and when i is equal to K-1 after 516, then loop 502 ends at 517. Even though address generation flow 500 has been described for multiple threads or sequences, it should be understood that such flow may be reduced down for a single sequence, in which case only one set of address and increment candidates would be obtained. Furthermore, it should be understood that more than two sets of address and increment candidates may be incremented for more than two threads or phases.
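The loop described for FIG. 5 can be approximated in Python as follows. This is a sketch rather than a transcription of the listing: the polynomial forms used for initialization are assumptions (Equations (1) through (3) are not shown in this section), the coefficients in the usage note are illustrative, the name `qpp_address_loop` is hypothetical, and the address is recorded before the candidates are advanced, which reorders steps 514 through 516 for convenience.

```python
def qpp_address_loop(block_size, f1, f2, skip=2):
    """Generate K addresses with one candidate pair per phase.

    Phase p (0 <= p < skip) handles x = p, p + skip, p + 2*skip, ...
    Positive candidates are those with a clear sign bit, i.e. >= 0.
    """
    K, n = block_size, skip
    # Initialization (assumed QPP forms), one entry per phase.
    a_cand = [(f1 * x + f2 * x * x) % K for x in range(n)]
    i_cand = [(f1 * n + f2 * n * n + 2 * f2 * n * x) % K for x in range(n)]
    step = (2 * f2 * n * n) % K

    addresses = []
    for i in range(K):                 # loop 502, set up at 511
        p = i % n                      # then current phase
        if a_cand[p] < 0:              # 512: negative address candidate
            a_cand[p] += K
        if i_cand[p] >= 0:             # 513: positive increment candidate
            i_cand[p] -= K
        addresses.append(a_cand[p])    # 516: output address for this phase
        a_cand[p] += i_cand[p]         # 514: next address candidate
        i_cand[p] += step              # 515: next increment candidate
    return addresses
```

Calling `qpp_address_loop(256, 15, 32, skip=2)` interleaves the even and odd phases so that the i-th output corresponds to x equal to i. Setting `skip=1` reduces the loop to a single sequence, and larger skip values add more phases without changing the per-iteration work.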
While the foregoing describes exemplary embodiment(s) in accordance with one or more aspects of the invention, other and further embodiment(s) in accordance with the one or more aspects of the invention may be devised without departing from the scope thereof, which is determined by the claim(s) that follow and equivalents thereof. For example, initialization may take place before any register in each engine, whereas the above description assumes initialization using the logic located in front of or just before an initial register of each engine. In other words, the exemplary embodiments just happen to show initialization in loadable adders 311 and 321 before registers 314 and 324, respectively, of FIG. 3. Initialization was assumed to be in adders 311 and 321 because these adders are less complex as they do not involve respective multiplexers. However, initialization may take place at a loadable subtractor 312 and a loadable adder 322. Or both streams may be initialized at once rather than sequentially. So the difference from subtractor 312 and the sum from adder 322 may be initialized for a first sequence at the same time as the sum from adder 311 and the sum from adder 321 are initialized for a second sequence. Also, when extra registers are inserted to allow for one or more extra streams, there may be no logic in front of such registers, and thus such registers may be used for initialization. Furthermore, if a first stream/sequence used first and third initialization values and a second stream/sequence used second and fourth initialization values, it should be understood that such first and second streams/sequences may be completely independent of one another and each may be started at any point in a block, though both may not have a same starting point. However, the first stream/sequence does not necessarily have to be initialized before or after the second stream/sequence.
Furthermore, where the third initialization value corresponds to the same stream/sequence as the first initialization value, and where the third initialization value initializes the second processing engine, the first initialization value may be used to initialize the first processing engine for the same stream/sequence with a specific start location between 0 and K-1 (inclusive). Similarly, the second initialization value and the fourth initialization value may correspond to the same stream/sequence.
Although the invention has been described with reference to particular embodiments thereof, it will be apparent to one of ordinary skill in the art that modifications to the described embodiment may be made without departing from the spirit of the invention. Accordingly, the scope of the invention will be defined by the attached claims and not by the above detailed description. It is noted that claims listing steps do not imply any order of the steps and that trademarks are the property of their respective owners.

Claims

What is claimed is:
1. An address generator, comprising: a first processing unit; and a second processing unit coupled to receive a stage output from the first processing unit and configured to provide an address output, wherein the stage output is in a first range from -K to -1 for a block size of K, and the address output is in a second range from 0 to K-1.
2. The address generator according to claim 1, wherein the address generator is part of a coding device selected from a group consisting of an encoder, a decoder, and a codec, wherein the address generator provides the address output for quadratic permutation polynomial interleaving.
3. The address generator according to claim 2, wherein the address output includes multiple address sequences.
4. The address generator according to claim 3, wherein the first processing unit and the second processing unit are respectively initialized with a first initialization value or a second initialization value.
5. The address generator according to claim 4, wherein the first initialization value is for a first sequence of the multiple address sequences; and wherein the second initialization value is for a second sequence of the multiple address sequences.
6. The address generator according to claim 2, wherein: the address output is for at least part of an address sequence from 0 to K-1; the first processing unit is initialized with a first initialization value and a second initialization value; and the second processing unit is initialized with a third initialization value and a fourth initialization value.
7. The address generator according to claim 1, wherein the first processing unit comprises: a first adder; a first register, coupled to the first adder; a first multiplexer, coupled to the first register; a first subtractor, coupled to the first multiplexer and the first register; and a second register, coupled to the subtractor, to output the stage output; wherein the stage output is fed-back to the first adder.
8. The address generator according to claim 7, wherein the first register processes a first sequence, and the second register simultaneously processes a second sequence.
9. The address generator according to claim 8, wherein the second processing unit comprises: a second adder to receive the stage output; a third register, coupled to the second adder; a second multiplexer, coupled to the third register; a third adder, coupled to the second multiplexer and the third register; and a fourth register, coupled to the third adder, to output the address output; wherein the address output is fed-back to an input of the second adder.
10. A method for generating addresses, comprising: obtaining a step size and a block size; obtaining a first initialization value and a second initialization value; adding the step size to a difference to provide a first sum; subtracting either a null value or the block size from the first sum responsive to a sign bit of the first sum to provide another difference, wherein the other difference is in a range of -K to -1 for block size of K; registering the first sum or the other difference; and feeding back the other difference in order to add the other difference to the step size.
11. The method according to claim 10, further comprising: generating a second sum by adding the other difference to a third sum; adding either the null value or the block size to the second sum in response to a sign bit of the second sum to provide another third sum, wherein the other third sum is in a range of 0 to K-1; registering the second sum or the other third sum; and feeding back the other third sum for another iteration of the step for adding to provide the second sum.
12. The method according to claim 11, wherein the registering the first sum or the other difference includes registering the other difference within respective feedback loops for pipelined operation, and wherein the registering the second sum or the other third sum includes registering the other third sum within respective feedback loops for pipelined operation.
13. The method according to claim 11, wherein the registering the first sum or the other difference includes registering the first sum within respective feedback loops for pipelined operation, and wherein the registering the second sum or the other third sum includes registering the second sum within respective feedback loops for pipelined operation.
14. The method according to claim 11, wherein the step of adding the step size to the difference to provide the first sum is performed simultaneously with the step of adding to provide the second sum by addition of the other difference to the third sum.
15. The method according to claim 11, further comprising providing the other third sum for quadratic permutation polynomial interleaving.
PCT/US2009/051224 2008-09-18 2009-07-21 Address generation WO2010033298A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2011527850A JP5242796B2 (en) 2008-09-18 2009-07-21 Address generation
KR1020117008803A KR101263152B1 (en) 2008-09-18 2009-07-21 address generation
EP09790666.3A EP2329362B1 (en) 2008-09-18 2009-07-21 Address generation
CN200980136779.7A CN102160032B (en) 2008-09-18 2009-07-21 Address produces

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/233,320 2008-09-18
US12/233,320 US8219782B2 (en) 2008-09-18 2008-09-18 Address generation

Publications (1)

Publication Number Publication Date
WO2010033298A1 true WO2010033298A1 (en) 2010-03-25

Family

ID=41278507

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/051224 WO2010033298A1 (en) 2008-09-18 2009-07-21 Address generation

Country Status (6)

Country Link
US (1) US8219782B2 (en)
EP (1) EP2329362B1 (en)
JP (1) JP5242796B2 (en)
KR (1) KR101263152B1 (en)
CN (1) CN102160032B (en)
WO (1) WO2010033298A1 (en)

Also Published As

Publication number Publication date
CN102160032B (en) 2016-08-31
KR101263152B1 (en) 2013-05-15
US20100070737A1 (en) 2010-03-18
JP5242796B2 (en) 2013-07-24
EP2329362B1 (en) 2019-10-02
US8219782B2 (en) 2012-07-10
JP2012503248A (en) 2012-02-02
CN102160032A (en) 2011-08-17
KR20110069108A (en) 2011-06-22
EP2329362A1 (en) 2011-06-08

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase; Ref document number: 200980136779.7; Country of ref document: CN
121 EP: the EPO has been informed by WIPO that EP was designated in this application; Ref document number: 09790666; Country of ref document: EP; Kind code of ref document: A1
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101)
WWE WIPO information: entry into national phase; Ref document number: 2009790666; Country of ref document: EP
WWE WIPO information: entry into national phase; Ref document number: 1856/CHENP/2011; Country of ref document: IN
WWE WIPO information: entry into national phase; Ref document number: 2011527850; Country of ref document: JP
NENP Non-entry into the national phase; Ref country code: DE
ENP Entry into the national phase; Ref document number: 20117008803; Country of ref document: KR; Kind code of ref document: A