US20040114609A1 - Interconnection system - Google Patents

Interconnection system

Info

Publication number
US20040114609A1
US20040114609A1
Authority
US
United States
Prior art keywords
interconnection system
node
bus
data
interconnection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/468,167
Inventor
Ian Swarbrick
Paul Winser
Stuart Ryan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Clearspeed Solutions Ltd
Original Assignee
ClearSpeed Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GB0103678A external-priority patent/GB0103678D0/en
Priority claimed from GB0103687A external-priority patent/GB0103687D0/en
Priority claimed from GB0121790A external-priority patent/GB0121790D0/en
Application filed by ClearSpeed Technology Ltd filed Critical ClearSpeed Technology Ltd
Assigned to CLEARSPEED TECHNOLOGY LIMITED reassignment CLEARSPEED TECHNOLOGY LIMITED ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: RYAN, STUART, SWARBRICK, IAN, WINSER, PAUL
Publication of US20040114609A1 publication Critical patent/US20040114609A1/en
Assigned to CLEARSPEED SOLUTIONS LIMITED reassignment CLEARSPEED SOLUTIONS LIMITED CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: CLEARSPEED TECHNOLOGY LIMITED

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/10Distribution of clock signals, e.g. skew
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/80Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/30Circuit design
    • G06F30/32Circuit design at the digital level
    • G06F30/327Logic synthesis; Behaviour synthesis, e.g. mapping logic, HDL to netlist, high-level language to RTL or netlist
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00Routing or path finding of packets in data switching networks
    • H04L45/74Address processing for routing
    • H04L45/742Route cache; Operation thereof

Definitions

  • the present invention relates to an interconnection network. In particular, but not exclusively, it relates to an intra chip interconnection network.
  • a typical bus system for interconnecting a plurality of functional units consists of either a set of wires with tri-state drivers, or two uni-directional data-paths incorporating multiplexers to get data onto the bus. Access to the bus is controlled by an arbitration unit, which accepts requests to use the bus, and grants one functional unit access to the bus at any one time.
  • the arbiter may be pipelined, and the bus itself may be pipelined in order to achieve a higher clock rate.
  • the system may comprise a plurality of routers which typically comprise a look up table. The data is then compared with the entries within the routing look up table in order to route the data onto the bus to its correct destination.
  • Another form of interconnection is a direct interconnection network.
  • These types of networks typically comprise a plurality of nodes, each of which is a discrete router chip.
  • Each node (router) may connect to a processor and to a plurality of other nodes to form a network topology.
  • The object of the present invention is to provide an interconnection network as an on-chip bus system. This is achieved by routing data on the bus as opposed to broadcasting data.
  • The routing of the data is achieved by a simple addressing scheme in which each transaction has routing information associated therewith, for example a geographical address, which enables the nodes within the interconnection network to route the transaction to its correct destination.
  • the routing information contains information on the direction to send the data packet.
  • This routing information is not merely an address of a destination but provides directional information, for example x,y coordinates of a grid to give direction.
  • The nodes do not need routing table(s) or global signals to determine the direction since all the information a node needs is contained in the routing information of the data packets. This enables the circuitry of the node and the interconnection system to be simplified, making integration of the system onto a chip feasible.
  • If each functional unit is connected to a node, and all nodes are connected together, then a pipeline connection will exist between each pair of nodes in the system.
  • The number of intervening nodes will govern the number of pipeline stages. If there is a pair of joined nodes where the distance between them is too great to transmit data within a single clock cycle, a repeater block can be inserted between the nodes. This block registers the data, while maintaining the same protocol as the other bus blocks. The inclusion of the repeater blocks allows interconnections of arbitrary length to be created.
  • The interconnection system according to the present invention can be utilised in an intra-chip interconnection network.
  • Data transfers are all packetized, and the packets may be of any length that is a multiple of the data-path width.
  • The nodes of the bus used to create the interconnection network (nodes and T-switches) all have registers on the data-path(s).
  • the main advantage of the present invention is that it is inherently re-usable.
  • the implementer need only instantiate enough functional blocks to form an interconnection of the correct length, with the right number of interfaces, and with enough repeater blocks to achieve the desired clock rate.
  • the interconnection system in accordance with the present invention employs distributed arbitration.
  • the arbitration capability grows as more blocks are added. Therefore, if the bus needs to be lengthened, it is a simple matter of instantiating more nodes and possibly repeaters. Since each module manages its own arbitration within itself, the overall arbitration capability of the interconnect increases. This makes the bus system of the present invention more scalable (in length and overall bandwidth) than other conventional bus systems.
  • the interconnection in accordance with the present invention is efficient in terms of power consumption. Since packets are routed, rather than broadcast, only the wires between the source and destination node are toggled. The remaining bus drivers are clock-gated. Hence the system of the present invention consumes less power.
  • every node on the bus has a unique address associated with it; an interface address.
  • a field in the packet is reserved to hold a destination interface address.
  • Each node on the bus will interrogate this field of an incoming packet; if it matches its interface address it will route the packet off the interconnection (or bus), if it does not match it will route the packet down the bus.
  • the addressing scheme could be extended to support “wildcards” for broadcast messages; if a subset of the address matches the interface address then the packet is routed off the bus and passed on down the bus, otherwise it is just sent on down the bus.
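  • As a purely illustrative sketch (not taken from the patent text), the per-node decision described above might be written as follows in Python; the field names, the wildcard mask representation and the return values are assumptions made for illustration only.
        def node_route(packet, my_interface_addr, wildcard_mask=0):
            # Decide what a node does with an incoming packet:
            #   'consume'          - route the packet off the bus to the local block
            #   'forward'          - pass the packet on down the bus
            #   'consume+forward'  - broadcast case: copy off the bus and pass it on
            dest = packet["dest_interface_addr"]
            if dest == my_interface_addr:
                return "consume"                  # exact match: leave the bus here
            if wildcard_mask and (dest & ~wildcard_mask) == (my_interface_addr & ~wildcard_mask):
                return "consume+forward"          # subset match: broadcast-style delivery
            return "forward"                      # no match: carry on down the bus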
  • For packets coming on to the bus, each interface unit interrogates the destination interface address of the packet. This is used to decide which direction a packet arriving on the bus from an attached unit is to be routed. In the case of a linear bus, this could be a simple comparison: if the destination address is greater than the interface address of the source of the data then the packet is routed “up” the bus, otherwise the packet is routed “down” the bus. This could be extended such that each node maps destination addresses, or ranges of addresses, to directions on the bus.
  • the interface unit sets a binary lane signal based on the result of this comparison.
  • functionality is split between the node and interface unit. All “preparation” of the data to be transported (including protocol requirements) is carried out in the interface unit. This allows greater flexibility as the node is unchanging irrespective of the type of data to be transported, allowing the node to be re-used in different circuits. More preferably the node directs the packet off the interconnection system to a functional unit.
  • The interface unit can carry out the following functions: take the packet from the functional unit, ensuring a correct destination module ID and head and tail bits; compare the destination module ID to the local module ID and set a binary lane signal based on the result of this comparison; pack the module ID, data and any high level (non bus) control signals into a flit; implement any protocol change necessary; and pass the lane signal and flit to the node using the inject protocol. A sketch of this sequence is given below.
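  • The sketch referred to above is illustrative Python only; the flit dictionary layout, the hardwired threshold used for the lane decision and the return convention are assumed names, not part of the specification.
        def interface_send(dest_mod_id, payload_flits, route_left_below):
            # Prepare a packet from a functional unit and hand it to the node.
            # route_left_below is a hardwired threshold: destinations below it go
            # on lane 0 ('left'), all others on lane 1 ('right'), mirroring a
            # simple linear-bus comparison.
            lane = 0 if dest_mod_id < route_left_below else 1     # binary lane signal
            flits = []
            for i, data in enumerate(payload_flits):
                flits.append({
                    "head": i == 0,                                # head bit on first flit
                    "tail": i == len(payload_flits) - 1,           # tail bit on last flit
                    "mod_id": dest_mod_id,                         # destination interface ID
                    "data": data,                                  # data plus high-level control
                })
            return lane, flits                                     # handed to the node's inject port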
  • a T-junction or switch behaves in a similar way; the decision here is simply whether to route the packet down one branch or the other. This would typically be done for ranges of addresses; if the address is larger than some predefined value then the packets are routed left, otherwise they are routed right. However, more complex routing schemes could be implemented if required.
  • the addressing scheme can be extended to support inter-chip communication.
  • a field in the address is used to define a target chip address with, for example, 0 in this field representing a local address of the chip.
  • this field will be compared with the pre-programmed address of the chip. If they match then the field is set to zero and the local routing operates as above. If they do not match, then the packet is routed along the bus to the appropriate inter-chip interface in order to be routed towards its final destination.
  • This scheme could be extended to allow a hierarchical addressing scheme to manage routing across systems, sub-systems, boards, groups of chips, as well as individual chips.
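  • A minimal sketch of the inter-chip check described above (illustrative only; the chip-field accessor and the return values are assumptions):
        def route_with_chip_field(packet, my_chip_addr):
            # A value of 0 in the chip field represents a local address of the chip.
            if packet["chip_addr"] in (0, my_chip_addr):
                packet["chip_addr"] = 0              # clear the field; local routing proceeds as above
                return "route_locally"
            return "route_to_inter_chip_interface"   # forward towards the appropriate inter-chip interface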
  • the system according to the present invention is not suitable for all bus-type applications.
  • The source and destination transactors are decoupled, since there is no central arbitration point.
  • The advantage of the approach of the present invention is that long buses (networks) can be constructed, with very high aggregate bandwidth.
  • the system of the present invention is protocol agnostic.
  • the interconnection of the present invention merely transports data packets.
  • Interface units in accordance with the present invention manage all protocol-specific features. This means that it is easy to migrate to a new protocol, since only the interface units need to be re-designed.
  • the present invention also provides flexible topology and length.
  • the repeater blocks of the present invention allow very high clock rates in that the overall modular structure of the interconnection prevents the clock rate being limited by long wires. This simplifies the synthesis and layout.
  • The repeater blocks not only pipeline the data as it goes downstream but also implement a flow control protocol, pipelining blockage information up the interconnection (or bus) rather than distributing a blocking signal globally.
  • A feature of this mechanism is that data compression (inherent buffering) is achieved on the bus equal to at least double the latency figure, i.e. if the latency through the repeater is one cycle then two flow control digits will concatenate when the bus is blocked. A flow control digit (flit) is the basic unit of data transfer over the interconnection of the present invention; it includes n bytes of data as well as some side-band control signals, and the flow control digit size equals the size of the bus data-path. This concatenation means that the scope of any blocking is minimised, thus reducing any queuing requirement in a functional block.
  • The flow of flow control digits is managed by a flow control protocol, in conjunction with double-buffering in a store block (and repeater unit) as described previously.
  • Customised interface units handle protocol specific features. These typically involve packing and unpacking of control information and data, and address translation. A customised interface unit can be created to match any specific concurrency.
  • Packets are injected (gated) onto the interconnection at each node, so that each node is allocated a certain amount of the overall bandwidth allocation (e.g. by being able to send, say 10 flow control digits within every 100 cycles). This distributed scheme controls the overall bandwidth allocation.
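  • The per-node bandwidth gating mentioned above (e.g. 10 flow control digits in every 100 cycles) could be modelled roughly as below; the fixed-window counter shown is an assumption, chosen only to illustrate the idea of a distributed bandwidth allocation.
        class InjectGate:
            # Allow at most `budget` flits to be injected in every `window` cycles.
            def __init__(self, budget=10, window=100):
                self.budget, self.window = budget, window
                self.sent, self.cycle = 0, 0

            def tick(self):
                self.cycle += 1
                if self.cycle % self.window == 0:
                    self.sent = 0                 # refill the allocation each window

            def may_inject(self):
                return self.sent < self.budget

            def inject(self):
                assert self.may_inject()
                self.sent += 1                    # one flit of this node's allocation used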
  • FIG. 1 is a block schematic diagram of the system incorporating the interconnection system according to an embodiment of the present invention
  • FIG. 2 is a block schematic diagram illustrating the initiator and target of a virtual component interface system of FIG. 1,
  • FIG. 3 is a block schematic diagram of the node of the interconnection system shown in FIG. 1;
  • FIG. 4 is a block schematic diagram of connection over the interconnection according to the present invention between virtual components of the system shown in FIG. 1;
  • FIG. 5 a is a diagram of the typical structure of the T-switch of FIG. 1;
  • FIG. 5 b is a diagram showing the internal connection of the T-switch of FIG. 5 a;
  • FIG. 6 illustrates the Module ID (interface ID) encoding of the system of FIG. 1;
  • FIG. 7 illustrates handshaking signals in the interconnection system according to an embodiment of the present invention
  • FIG. 9 illustrates blocking for two cycles of the interconnection system according to an embodiment of the present invention:
  • FIG. 10 illustrates virtual component interface handshake according an embodiment of the present invention
  • FIG. 11 illustrates a linear chip arrangement of the system according to an embodiment of the present invention
  • FIG. 12 is a schematic block diagram of the interconnection system of the present invention illustrating an alternative topology
  • FIG. 13 is a schematic block diagram of the interconnection system of the present invention illustrating a further alternative topology
  • FIG. 14 illustrates an example of a traffic handling subsystem according to an embodiment of the present invention
  • FIG. 15 illustrates a system for locating chips on a virtual grid according to a method of a preferred embodiment of the present invention.
  • FIG. 16 illustrates routing a transaction according to the method of a preferred embodiment of the present invention.
  • the basic mechanism for communicating data and control information between functional blocks is that blocks exchange messages using the interconnection system 100 according to the present invention.
  • the bus system can be extended to connect blocks in a multi chip system, and the same mechanism works for blocks within a chip or blocks on different chips.
  • An example of a system 100 incorporating the interconnection system 110 according to an embodiment of the present invention, as shown in FIG. 1, comprises a plurality of reusable on-chip functional blocks or virtual component blocks 105 a , 105 b and 105 c .
  • These functional units interface to the interconnection and can be fixed. They can be re-used at various levels of abstraction (e.g. RTL, gate level, GDSII layout data) in different circuit designs. The topology can be fixed once the size, aspect ratio and the location of the I/Os to the interconnection are known.
  • Each on-chip functional unit 105 a, 105 b, 105 c is connected to the interconnection system 110 via its interface unit.
  • the interface unit handles address decoding and protocol translation.
  • the on-chip functional block 105 a for example, is connected to the interconnection system 110 via an associated virtual component interface initiator 115 a and peripheral virtual component interface initiator 120 a.
  • the on-chip functional block 105 b is connected to the interconnection system 110 via an associated virtual component interface target 125 b and peripheral virtual component interface target 130 b.
  • the on-chip functional block 105 c is connected to the interconnection system 110 via an associated virtual component interface initiator 115 c and peripheral virtual component interface target 130 c.
  • the associated initiators and targets for each on-chip functional block shown in FIG. 1 are purely illustrative and may vary depending on the associated block requirements.
  • A functional block may have a number of connections to the interconnection system. Each connection has an advanced virtual component interface (extensions forming a superset of the basic virtual component interface; this is the protocol used for the main data interfaces in the system of the present invention) or a peripheral virtual component interface (a low bandwidth interface allowing atomic operations, mainly used in the present invention for control register access).
  • Virtual component interface is an OCB standard interface to communicate between a bus and/or virtual component, which is independent of any specific bus or virtual component protocol.
  • There are three types of virtual component interfaces: the peripheral virtual component interface 120 a, 130 b, 130 c, the basic virtual component interface and the advanced virtual component interface.
  • the basic virtual component interface is a wider, higher bandwidth interface than the peripheral virtual component interface.
  • The basic virtual component interface allows split transactions. Split transactions are those where the request for data and the response are decoupled, so that a request for data does not need to wait for the response to be returned before initiating further transactions.
  • Advanced virtual component interface is a superset of basic virtual component interface; Advanced virtual component interface and peripheral virtual component interface have been adopted in the system according to the embodiment of the present invention.
  • the advanced virtual component interface unit comprises a target and initiator.
  • the target and initiator are virtual components that send request packets and receive response packets.
  • the initiator is the agent that initiates transactions, for example, DMA (or EPU on F150).
  • an interface unit that initiates a read or write transaction is called an initiator 210 (issues a request 220 ), while an interface that receives the transaction is called the target 230 (responds to a request 240 ).
  • an initiator 210 issues a request 220
  • an interface that receives the transaction is called the target 230 (responds to a request 240 ).
  • Connections between each on-chip functional block 105 a, 105 b and 105 c and its associated initiators and targets are made using the virtual component interface protocol.
  • Each initiator 115 a, 120 a, 115 c and target 125 b, 130 b, 130 c is connected to a unique node 135, 140, 145, 150, 155 and 160.
  • Communication between each initiator 115 a , 120 a , 115 c and target 125 b , 130 b , 130 c uses the protocol in accordance with the embodiment of the present invention and as described in more detail below.
  • the interconnection system 110 comprises three separate buses 165 , 170 and 175 .
  • the RTL components have parameterisable widths, so these may be three instances of different width.
  • An example might be a 64-bit wide peripheral virtual component bus 170 (32 bit address+32 data bits), a 128-bit advanced virtual component interface bus 165 , and a 256-bit advanced virtual component interface bus 175 .
  • Although three separate buses are illustrated here, it is appreciated that the interconnection system of the present invention may incorporate any number of separate buses.
  • A repeater unit 180 may be inserted for all the buses 165, 170 and 175. There is no restriction on the length of the buses 165, 170 and 175. Variations in the length of the buses 165, 170 and 175 would merely require an increased number of repeater units 180. Repeater units would of course only be required when the timing constraints between two nodes cannot be met due to the length of wire of the interconnection.
  • T-switches (3-way connectors or the like) 185 can be provided.
  • the interconnection system of the present invention can be used in any topology but care should be taken when the topology contains loops as deadlock may result.
  • Data is transferred on the interconnection network of the present invention in packets.
  • the packets may be of any length that is a multiple of the data-path width.
  • the nodes 135 , 140 , 145 , 150 , 155 and 160 according to the present invention used to create the interconnection network (node and T-switch) all have registers on the data-path(s).
  • Each interface unit is connected to a node within the interconnection system itself, and therefore to one particular lane of the bus. Connections may be of initiator or target type, but not both, following from the conventions of the virtual component interface. In practice every block is likely to have a peripheral virtual component interface target interface for configuration and control.
  • bus components according to the embodiment of the present invention use distributed arbitration, where each block in the bus system manages access to its own resources.
  • a node 135 according to the embodiment of the present invention is illustrated in FIG. 3.
  • The nodes 135, 140, 145, 150, 155 and 160 are substantially similar.
  • Node 135 is connected to the bus 175 of FIG. 1.
  • Each node comprises a first and second input store 315 , 320 .
  • the first input store 315 has an input connected to a first bus lane 305 .
  • the second input store 320 has an input connected to a second bus lane 310 .
  • the output of the first input store 315 is connected to a third bus lane 306 and the output of the second input store 320 is connected to a fourth bus lane 311 .
  • Each node further comprises an inject control unit 335 and a consume control unit 325 .
  • The node may not require consume arbitration; for example, the node may have an output for each uni-directional lane while the consume handshaking is retained.
  • the input of the inject control unit 335 is connected to the output of an interface unit of the respective functional unit for that node.
  • The outputs of the inject control unit 335 are connected to a fifth bus lane 307 and a sixth bus lane 312.
  • the input of the consume control unit 325 is connected to the output of a multiplexer 321 .
  • the inputs of the multiplexer 321 are connected to the fourth bus lane 311 and the third bus lane 306 .
  • the output of the consume control unit 325 is connected to a bus 330 which is connected to the interface unit of the respective functional unit for that node.
  • the fifth bus lane 307 and the third bus lane 306 are connected to the inputs of a multiplexer 308 .
  • the output of the multiplexer 308 is connected to the first bus lane 305 .
  • the fourth bus lane 311 and the sixth bus lane 312 are connected to the inputs of a multiplexer 313 .
  • the output of the multiplexer 313 is connected to the second bus lane 310 .
  • The nodes are the connection points where data leaves or enters the bus. They also form part of the transport medium.
  • the node forms part of the bus lane which it connects to, including both directions of data path.
  • The node conveys data on the lane to which it connects, with one cycle of latency when not blocked. It also allows the connecting functional block to inject and consume data in either direction, via its interface unit. Arbitration of injected or passing data is performed entirely within the node.
  • bus 175 consists of a first lane 305 and a second lane 310 .
  • The first and second lanes 305 and 310 are physically separate unidirectional buses that are multiplexed and de-multiplexed to the same interfaces within the node 135.
  • the direction of data flow of the first lane 305 is in the opposite direction to that of the second lane 310 .
  • Each lane 305 and 310 has a lane number.
  • the lane number is a parameter that is passed from the interface unit to the node to determine which lane (and hence which direction) each packet is sent to.
  • the direction of the data flow of the first and second lanes 305 and 310 can be in the same direction. This would be desirable if the blocks transacting on the bus only need to send packets in one direction.
  • the node 135 is capable of concurrently receiving and injecting data on the same bus lane. At the same time it is possible to pass data through on the other lane.
  • Each uni-directional lane 305 , 310 carries a separate stream 306 , 307 , 311 , 312 of data. These streams 306 , 307 , 311 , 312 are multiplexed together at the point 321 where data leaves the node 135 into the on-chip module 105 a (not shown here) via the interface unit 115 a and 120 a (not shown here).
  • the data streams 306 , 307 , 311 , 312 are de-multiplexed from the on-chip block 105 a onto the bus lanes 305 and 310 in order to place data on the interconnection 110 .
  • Each lane can independently block or pass data through. Data can be consumed from one lane at a time, and injected on one lane at the same time. Concurrent inject and consume on the same lane is also permitted. Which lane each packet is injected on is determined within the interface unit.
  • Each input store (or register) 315 and 320 registers the data as it passes from node to node.
  • Each store 315 , 320 contains two flit-wide registers. When there is no competition for bus resources, only one of the registers is used. When the bus blocks, both registers are then used. It also implements the ‘block on header’ feature. This is needed to allow packets to be blocked at the header flit so that a new packet can be injected onto the bus.
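  • A rough behavioural model of such a store block (illustrative only; the two-entry buffer and the block-on-header flag below are sketched from the description rather than taken from the patent's implementation):
        from collections import deque

        class StoreBlock:
            # Two flit-wide registers: one used in normal flow, both when the bus blocks.
            def __init__(self):
                self.regs = deque(maxlen=2)
                self.block_on_header = False      # when set, hold a packet at its head flit

            def can_accept(self):
                return len(self.regs) < 2

            def push(self, flit):
                assert self.can_accept()
                self.regs.append(flit)

            def pop(self, downstream_ready):
                if not self.regs or not downstream_ready:
                    return None                   # data blocks in place
                if self.block_on_header and self.regs[0].get("head"):
                    return None                   # hold this packet so a new one can be injected
                return self.regs.popleft()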
  • the output interface unit 321 , 325 multiplexes both bus lanes 305 , 310 onto one lane 330 that feeds into the on-chip functional unit 105 a via the interface unit which is connected to the node 135 .
  • the output interface unit 321 , 325 also performs an arbitration function, granting one lane access to the on-chip functional unit, while blocking the other.
  • Each node also comprises an input interface unit 335 .
  • the input interface unit 335 performs de-multiplexing of packets onto one of the bus lanes 305 , 310 . It also performs an arbitration function, blocking the packet that is being input until the requested lane is available.
  • a plurality of repeater units 180 are provided at intervals along the length of the interconnection 110 .
  • Each repeater unit 180 is used to introduce extra registers on the data path. It adds an extra cycle of latency, but is only used where there is a difficulty meeting timing constraints.
  • Each repeater unit 180 comprises a store similar to the store unit of the nodes. The store unit merely passes data onwards, and implements blocking behaviour. There is no switching carried out in the repeater unit.
  • The repeater block allows for more freedom in chip layout. For example, it allows long lengths of wire between nodes, or, where a block has a single node connecting to a single lane, repeaters may be inserted into the other lanes in order to produce uniform timing characteristics over all lanes. There may be more than one repeater between two nodes.
  • the system according to the embodiment of the present invention is protocol agnostic, that is to say, the data-transport blocks such as the nodes 135 , 140 , 145 , 150 , 155 , 160 , repeater units 180 and T-switch 185 simply route data packets from a source interface to a destination interface. Each packet will contain control information and data. The packing and unpacking of this information is performed in the interface units 115 a , 120 a , 125 b , 130 b , 115 c , 130 c . In respect of the preferred embodiment, these interface units are virtual component interfaces, but it is appreciated that any other protocol could be supported by creating customised interface units.
  • a large on-chip block may have several interfaces to the same bus.
  • The targets and initiators 115 a, 120 a, 125 b, 130 b, 115 c, 130 c of the interface units perform conversion between the advanced virtual component interface and bus protocols in the initiator, and from the bus to the advanced virtual component interface in the target.
  • the protocol is an asynchronous handshake on the advanced virtual component interface side illustrated in FIG. 10.
  • The interface unit initiator comprises a send path. This path performs conversion from the advanced virtual component interface communication protocol to the bus protocol. It extracts a destination module ID or interface ID (a block may be connected to several buses, with a different module (interface) ID on each bus) from the address, packs it into the correct part of the packet, and uses the module ID in conjunction with a hardwired routing table to generate a lane number (e.g. 1 for right, 0 for left).
  • the initiator blocks the data at the advanced virtual component interface when it cannot be sent onto the bus.
  • the interface unit initiator also comprises a response path. The response path receives previously requested data, converting from bus communication protocol to the virtual component interface protocol. It blocks data on the bus if the on-chip virtual component block is unable to receive it.
  • the interface unit target comprises a send path which receives incoming read and write requests.
  • the target converts from bus communication protocol to advanced virtual component interface protocol. It blocks data on the bus if it cannot be accepted across the virtual component interface.
  • the target also comprises a response path which carries read (and for verification purposes, write) requests. It converts advanced virtual component interface communication protocol to bus protocol and blocks data at the advanced virtual component interface if it cannot be sent onto the bus.
  • the other type of interface unit utilised in the embodiment of the present invention is a peripheral virtual component unit.
  • The main differences between the peripheral virtual component interface and the advanced virtual component interface are that the data interface of the peripheral virtual component interface is potentially narrower (up to 4 bytes) than that of the advanced virtual component interface, and that the peripheral virtual component interface is not split-transaction.
  • the peripheral virtual component interface units perform conversion between the peripheral virtual component interface and bus protocols in the initiator, and from the bus protocol to peripheral virtual component interface protocol in the target.
  • the protocol is an asynchronous handshake on the peripheral virtual component interface side.
  • the interface unit initiator comprises a send path. It generates destination module ID and the transport lane number from memory address. The initiator blocks the data at the peripheral virtual component interface when it cannot be sent onto the bus. The initiator also comprises a response path. This path receives previously requested data, converting from bus communication protocol to the peripheral virtual component interface protocol. It also blocks data on the bus if the on-chip block (virtual component block) is unable to receive it.
  • the peripheral virtual component interface unit target comprises a send path which receives incoming read and write requests. It blocks data on the bus if it cannot be accepted across the virtual component interface.
  • the target also comprises a response path which carries read (and for verification purposes, write) requests. It converts peripheral virtual component interface communication protocol to bus protocol and blocks data at the virtual component interface if it cannot be sent onto the bus.
  • the peripheral virtual component interface initiator may comprise a combined initiator and target. This is so that the debug registers (for example) of an initiator can be read from.
  • the virtual component (on-chip) blocks can be connected to each other over the interconnection system according to the present invention.
  • a first virtual component (on-chip) block 425 is connected point to point to an interface unit target 430 .
  • the interface unit target 430 presents a virtual component initiator interface 440 to the virtual component target 445 of on-chip block 425 .
  • the interface unit target 430 uses a bus protocol conversion unit 448 to interface to the bus interconnect 450 .
  • the interface unit initiator 460 presents a target interface 470 to the initiator 457 of the second on-chip block 455 and, again, uses a bus protocol conversion unit 468 on the other side.
  • the T-switch 185 of FIG. 1 is a block that joins 3 nodes, allowing more complex interconnects than simple linear ones.
  • the interface ID of each packet is decoded and translated into a single bit, which represents the two possible outgoing ports.
  • a hardwired table inside the T-Switch performs this decoding. There is one such table for each input port on the T-Switch. Arbitration takes place for the output ports if there is a conflict. The winner may send the current packet, but must yield when the packet has been sent.
  • FIGS. 5 a and 5 b show an example of the structure of a T-switch.
  • the T-switch comprises three sets of input/output ports 505 , 510 , 515 connected to each pair of unidirectional bus lanes 520 , 525 , 530 .
  • a T-junction 535 , 540 , 545 is provided for each pair of bus lanes 520 , 525 , 530 such that an incoming bus 520 coming into an input port 515 can be output via output port 505 or 510 , for example.
  • the T-switch 185 comprises a lane selection unit.
  • the lane selection unit takes in module ID of incoming packets and produces a 1-bit result corresponding to the two possible output ports on the switch.
  • the T-switch also comprises a store block on each input lane. Each store block stores data flow control digits and allows them to block in place if the output port is temporarily unable to receive. It also performs a block on header function, which allows switching to occur at the packet level (rather than the flow control digit level).
  • the T-switch also includes an arbiter for arbitration between requests to use output ports.
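  • The lane-selection and arbitration behaviour of the T-switch could be sketched as follows (illustrative Python; the table contents and the simple yield-after-packet arbiter are assumptions):
        def select_output_port(mod_id, hardwired_table):
            # One hardwired table per input port maps interface ID -> 1-bit output port.
            return hardwired_table[mod_id]        # 0 = one branch, 1 = the other

        def arbitrate(requests, last_winner):
            # Grant one requester per output port; the previous winner must yield
            # once its current packet has been sent.
            if not requests:
                return None
            ordered = sorted(requests, key=lambda r: (r == last_winner, r))
            return ordered[0]                     # the last winner is considered last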
  • the interconnection system powers up into a usable state. Routing information is hardcoded into the bus components.
  • a destination module interface ID (mod ID) for example as illustrated in FIG. 6 is all that is required to route a packet to another node. In order for that node to return a response packet, it must have been sent the module interface ID of the sender.
  • There may be more than one interconnection in a processing system.
  • Every interface (which includes an inject and a consume port) has a unique ID. These IDs are hard-coded at silicon compile-time.
  • Units attached to the bus are free to start communicating straight after reset.
  • the interface unit will hold off communications (by not acknowledging them) until it is ready to begin operation.
  • the interconnection system has an internal protocol that is used throughout. At the interfaces to the on-chip blocks this may be converted to some other protocol, for example virtual component interface as described above.
  • The internal protocol will be referred to as the bus protocol. This bus protocol allows single cycle latency for packets travelling along the bus when there is no contention for resources, and allows packets to block in place when contention occurs.
  • the bus protocol is used for all internal (non interface/virtual component interface) data transfers. It consists of five signals: occup 705 , head 710 , tail 715 , data 720 and valid 725 between a sender 735 and a receiver 730 . These are shown in FIG. 7.
  • the packets consist of one or more flow control digits. On each cycle that the sender asserts the valid signal, the receiver must accept the data on the next positive clock edge.
  • the receiver 730 informs the sender 735 about its current status using the occup signal 705 .
  • This is a two-bit wide signal. Table I lists the occup signal values and their meanings.
  • the occup signal 705 tells the sender 735 if and when it is able to send data. When the sender 735 is allowed to transmit a data flow control digit, it is qualified with a valid signal 725 .
  • Each node and T-Switch use these signals to perform switching at the packet level.
  • FIG. 8 shows an example of blocking behaviour on the interconnect system according to an embodiment of the present invention.
  • the occup signal is set to ‘1’, meaning ‘if sending a flow control digit this cycle, don't send one on the next cycle’.
  • FIG. 9 shows an example of the blocking mechanism more completely.
  • the occup signal is set to 01 (binary), then to 10 (binary).
  • the sender can resume transmitting flow control digits when the occup signal is set back to 01—at that point it is not currently sending a flow control digit, so it is able to send one on the next cycle.
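  • A sketch of the sender-side rule implied by FIGS. 8 and 9 (illustrative Python; the encoding of the occup values beyond those discussed in the text is an assumption):
        EMPTY, ALMOST_FULL, FULL = 0b00, 0b01, 0b10   # assumed occup encodings

        def may_send(occup, sending_this_cycle):
            # Return True if the sender may assert 'valid' on the next cycle.
            if occup == EMPTY:
                return True
            if occup == ALMOST_FULL:
                # 'if sending a flow control digit this cycle, don't send one on the next cycle'
                return not sending_this_cycle
            return False                              # FULL: the receiver cannot take more data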
  • the protocol at the boundary between the node and the interface unit is different from that just described, and is similar to that used by the virtual component interface.
  • The consume (bus output) protocol is different from the inject protocol but uses the minimum logic needed to allow registered outputs (which simplifies synthesis and integration into a system on chip).
  • The bus protocol allows the exchange of packets consisting of one or more flow control digits. Eight bits in the upper part of the first flow control digit of a packet carry the destination module ID, and are used by the bus system to deliver the packet. The top 2 bits are also used for internal bus purposes. In all other bit fields, the packing of the flow control digits is independent of the bus system.
  • virtual component interface protocol is used.
  • the interface control and data fields are packed into bus flow control digits by the sending interface and then unpacked at the receiving interface unit.
  • the main, high-bandwidth, interface to the bus uses the advanced virtual component interface. All features of the advanced virtual component interface are implemented, with the exception of those used to optimise the internal operation of an OCB.
  • the bus interface converts data and control information from the virtual component interface protocol to the bus internal communication protocol.
  • control bits and data are packed up into packets and sent to the destination interface unit, where they are unpacked and separated back into data and control.
  • Although virtual component interface compliant interface units are utilised, it is appreciated that different interface units may be used instead (e.g. ARM AMBA compliant interfaces).
  • Table II shows the fields within the data flow control digits that are used by the interconnection system according to an embodiment of the present invention. All other information in the flow control digits is simply transported by the bus. The encoding and decoding is performed by the interface units. The interface units also insert the head and tail bits into the flow control digits, and insert the MOD ID in the correct bit fields.

    TABLE II - Specific fields
    Name     Bit                                        Comments
    Head     FLOW CONTROL DIGIT_WIDTH - 1               Set to '1' to indicate first flow control digit of packet.
    Tail     FLOW CONTROL DIGIT_WIDTH - 2               Set to '1' to indicate last flow control digit of packet.
    MOD ID   down to FLOW CONTROL DIGIT_WIDTH - 10      Virtual component interface calls this MOD ID. It is really an interface ID, since a large functional unit could have multiple bus interfaces, in which case it is necessary to distinguish between them.
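  • Packing the bus-specific fields into a flow control digit could look roughly like this (illustrative; placing the 8-bit module ID immediately below the head and tail bits is inferred from Table II and the surrounding text, and the flit width is a free parameter):
        def pack_flit(data, mod_id, head, tail, flit_width=256):
            # Head and tail occupy the top two bits; the interface ID sits below them.
            flit = data & ((1 << (flit_width - 10)) - 1)          # lower bits: transported payload
            flit |= (mod_id & 0xFF) << (flit_width - 10)          # 8-bit destination interface ID
            flit |= (1 << (flit_width - 2)) if tail else 0        # tail bit
            flit |= (1 << (flit_width - 1)) if head else 0        # head bit
            return flit

        def unpack_mod_id(flit, flit_width=256):
            return (flit >> (flit_width - 10)) & 0xFF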
  • the advanced virtual component interface packet types are read request, write request and read response.
  • a read request is a single flow control digit packet and all of the relevant virtual component interface control fields are packed into the flow control digit.
  • a write request consists of two or more flow control digits.
  • the first flow control digit contains virtual component interface control information (e.g. address).
  • the subsequent flow control digits contain data and byte enables.
  • a read response consists of one or more flow control digits.
  • the first and subsequent flow control digits all contain data plus virtual component interface response fields (e.g. RSCRID, RTRDID and RPKTID).
  • Field packing (fragment):
        CMDVAL   1 bit                 IT   Handshake signal
        WDATA    128 bits   127:0      IT   Only for write requests.
        BE       16 bits    143:128    IT   Only for write requests.
  • Peripheral virtual component interface burst-mode read and write transactions are not supported over the bus, as these cannot be efficiently implemented. For this reason, the peripheral virtual component interface EOP signal should be fixed at logic '1'. Any additional processing units or external units can be attached to the bus, but the EOP signal should again be fixed at logic '1'. With this change, the new unit should work normally.
  • the read request type is a single flow control digit packet carrying the 32-bit address of the data to be read.
  • the read response is a single flow control digit response containing the requested 32 bits of data.
  • the write request is a single flow control digit packet containing the 32-bit address of the location to be written, plus the 32 bits of data, and 4 bits of byte enable. The write response prevents a target responding to a write request in the same way that it would to a read request.
  • the internal addressing mechanism of the bus is based on the assumption that all on-chip blocks in the system have a fixed 8-bit module ID.
  • Virtual component interface specifies the use of a single global address space. Internally the bus delivers packets based on the module ID of each block in the system. The module ID is 8 bits wide. All global addresses will contain the 8 bits module ID, and the interface unit will simply extract the destination module ID from the address. The location of the module ID bits within the address is predetermined.
  • the module IDs in each system are divided into groups. Each group may contain up to 16 modules.
  • the T-switches in the system use the group ID to determine which output port to send each packet to.
  • each group there may be up to sixteen on-chip blocks, each with a unique subID.
  • the inclusion of only sixteen modules within each group does not restrict the bus topology.
  • Within each linear bus section there may be more than one group, but modules from different groups may not interleave. There may be more than sixteen modules between T-switches.
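  • A small sketch of the module ID structure described above (illustrative; the even 4-bit/4-bit split between group ID and sub-ID, and the position of the module ID within the 64-bit address, are assumptions):
        MOD_ID_SHIFT = 48          # assumed location of the 8-bit module ID in the global address

        def mod_id_from_address(addr64):
            return (addr64 >> MOD_ID_SHIFT) & 0xFF

        def split_mod_id(mod_id):
            group_id = (mod_id >> 4) & 0xF      # used by T-switches to pick an output port
            sub_id = mod_id & 0xF               # up to sixteen on-chip blocks per group
            return group_id, sub_id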
  • the bus system blocks operate at the same clock rate as synthesisable RTL blocks in any given process.
  • the target clock rate is 400 MHz.
  • The latency of packets on the bus is: one cycle at the sending interface unit; one cycle per bus block (nodes and repeaters) that the data passes through; one or two additional cycle(s) at the node consume unit; one cycle at the receiving interface unit; and n - 1 cycles, where n is the packet length in flow control digits.
  • The length of the packets is unlimited. However, consideration should be given to excessively large packets, as they will utilise a greater amount of the bus resource. Long packets can be more efficient than a number of shorter ones, due to the overhead of having a header flow control digit.
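  • The latency figure quoted above can be added up as follows (illustrative helper; the consume-unit cost is one or two cycles as stated):
        def packet_latency(bus_blocks, packet_flits, consume_cycles=2):
            # sending interface + per-block cycles + consume unit + receiving
            # interface + (n - 1) cycles for the remaining flits of the packet
            return 1 + bus_blocks + consume_cycles + 1 + (packet_flits - 1)

        # e.g. 4 nodes/repeaters traversed, 3-flit packet, 2-cycle consume:
        # packet_latency(4, 3) == 1 + 4 + 2 + 1 + 2 == 10 cycles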
  • the interconnection system does not guarantee that requested data items would be returned to a module in the order in which they were requested.
  • the block is responsible for re-ordering packets if the order matters. This is achieved using the advanced virtual component interface pktid field, which is used to tag and reorder outstanding transactions. It cannot be assumed that data will arrive at the on-chip block in the same order that it was requested from other blocks in the system. Where ordering is important, the receiving on-chip block must be able to re-order the packets. Failure to adhere to this rule is likely to result in system deadlock.
  • the interconnection system according to the present invention offers considerable flexibility in the choice of interconnect topology. However, it is currently not advisable to have loops in the topology as these will introduce the possibility of deadlock.
  • a further advantage of the interconnection system according to the present invention is that saturating the bus with packets will not cause it to fail. The packets will be delivered eventually, but the average latency will increase significantly. If the congestion is reduced to below the maximum throughput, then it will return to “normal operation”.
  • the interface units may be incorporated into either the on-chip blocks or an area reserved for the bus, depending on the requirements of the overall design, for example there may be a lot of area free under the bus and so using this area rather than adding the functional block area would make more sense as it would reduce the overall chip area.
  • the interconnection system forms the basis of a system-on-chip platform.
  • The nodes contain the necessary “hooks” to handle distribution of the system clock and reset signals.
  • the routing of transactions and responses between initiator and target is performed by the interface blocks that connect to the interconnection system, and any intervening T-switch elements in the system. Addressing of blocks is hardwired and geographic, and the routing information is compiled into the interface and T-switch logic at chip integration time.
  • the platform requires some modularity at the chip level as well as the block level on chips. Therefore, knowledge of what other chips or their innards they are connected to can not be hard-coded in the chips themselves, as this may vary on different line cards.
  • FIG. 11 illustrates an example of a linear chip arrangement.
  • The chips 1100(0) to 1100(2) etc. are numbered sequentially so that any block in any chip knows whether a transaction must be routed “up” or “down” the interconnection 1110 to reach its destination, as indicated in the chip ID field of the physical address. This is exactly the same process as the block performs to route to another block on the same chip. In this case a 2-level decision is utilised: if the destination is in the present chip then route on Block ID, else route on Chip ID.
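  • The 2-level decision can be written directly (illustrative; the field names are assumed):
        def linear_route(my_chip_id, my_block_id, dest_chip_id, dest_block_id):
            # Decide 'up' or 'down' first on Chip ID, then on Block ID within the present chip.
            if dest_chip_id == my_chip_id:
                return "up" if dest_block_id > my_block_id else "down"
            return "up" if dest_chip_id > my_chip_id else "down"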
  • An alternative topology is shown in FIG. 12. It comprises a first bus lane 1201 and a second bus lane 1202 arranged in parallel.
  • the first and second bus lane correspond to the interconnection system of the embodiment of the present invention.
  • a plurality of multi threaded array processors (MTAPs) 1210 are connected across the two bus lanes 1201 , 1202 .
  • A network input device 1220, a collector device 1230, a distributor device 1240 and a network output device 1250 are connected to the second bus lane 1202, and a table lookup engine 1260 is connected to the first bus lane 1201. Details of the operation of the devices connected to the bus lanes are not provided here.
  • the first bus lane 1201 (256 bits wide, for example) is dedicated to fast path packet data.
  • a second bus lane 1202 (128 bits wide, for example) is used for general non-packet data, such as table lookups, instruction fetching, external memory access, etc.
  • Blocks accessing bus lanes 1201 , 1202 use AVCI protocol.
  • An additional bus lane may be used (not shown here) for reading and writing block configuration and status registers. Blocks accessing this lane use the PVCI protocol.
  • Blocks can have multiple bus interfaces, for example. Lane widths can be configured to meet the bandwidth requirements of the system.
  • Since the interconnection system of the present invention uses point-to-point connections between interfaces and distributed arbitration, it is possible to have several pairs of functional blocks communicating simultaneously without any contention or interference. Traffic between blocks can only interfere if that traffic travels along a shared bus segment in the same direction. This situation can be avoided by choosing a suitable layout. Thus, bus contention can be avoided in the fast path packet flow. This is important to achieve predictable and reliable performance, and to avoid overprovisioning the interconnection.
  • This topology uses a T-junction 1305 to exploit the fact that traffic going in opposite directions on the same bus segment 1300 is non-interfering. Using the T-junction block 1305 may ease the design of the bus topology to account for layout and floor planning constraints.
  • The interconnection system of the present invention preferably supports advanced virtual component interface transactions, which are simply variable size messages as defined in the virtual component interface standard, sent from an initiator interface to a target interface, possibly followed by a response at a later time. Because the response may be delayed, this is called a split transaction in the virtual component interface system.
  • the network processing system architecture defines two higher levels of abstraction in the inter-block communication protocol, the chunk and the abstract datagram (frequently simply called a datagram).
  • a chunk is a logical entity that represents a fairly small amount of data to be transferred from one block to another.
  • An abstract datagram is a logical entity that represents the natural unit of data for the application.
  • Chunks are somewhat analogous to CSIX C frames, and are used for similar purposes, that is, to have a convenient, small unit of data transfer. Chunks have a defined maximum size, typically about 512 bytes, while datagrams can be much larger, typically up to 9K bytes; the exact size limits are configurable.
  • the system addressing scheme according to the embodiment of the present invention will now be described in more detail.
  • the system according to an embodiment of the present invention may span a subsystem that is implemented in more than one chip.
  • the routing of transactions and responses between initiators and targets is performed by the interface blocks that connect to the interconnection itself, and the intervening T-switch elements in the interconnection. Addressing according to an embodiment of the present invention of the blocks is hardwired and geographic, and the routing information is compiled into the interface units, T-switch and node elements logic at chip integration.
  • the interface ID occupies the upper part of the physical 64 bit address, the lower bits being the offset within the block. Additional physical bits are reserved for the chip ID to support multi-chip expanses.
  • The platform according to the embodiment of the present invention requires some modularity at the chip level as well as at the block level on chips. Knowledge of what other chips, or their innards, a chip is connected to cannot be hard-coded, as this may vary on different line cards. This prevents the use of the same hard-wired bus routing information scheme as exists in the interface units for transactions within one chip.
  • An example of a traffic handler subsystem in which the packet queue memory is implemented around two memory hub chips is shown in FIGS. 14 and 15.
  • The four chips have four connections to other chips. This results in possible ambiguities about the route that a transaction takes from one chip to the next. Therefore, it is necessary to control the flow of transactions by configuring the hardware and software appropriately, but without having to include programmable routing functions in the interconnection.
  • The chip ID for chip 1401 may be (4,2), for chip 1403 (5,1), for chip 1404 (5,3) and for chip 1402 (6,2).
  • Simple, hardwired rules are applied to determine how to route the next hop of a transaction destined for another chip.
  • The chips are located on a virtual “grid” such that the local rules produce the desired transaction flows.
  • the grid can be “distorted” by leaving gaps or dislocations to achieve the desired effect.
  • Each chip has complete knowledge of itself, including how many external ports it has and their assignments to N,S,E,W compass points. This knowledge is wired into the interface units and T-switches. A chip has no knowledge at all of what other chips are connected to it, or their x,y coordinates.
  • a transaction is routed out on the interface that forms the least angle with its relative grid location.
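  • A sketch of the “least angle” rule on the virtual grid (illustrative Python; representing the compass points as unit vectors and comparing angles with atan2 is one assumed way to implement the rule):
        import math

        COMPASS = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

        def pick_port(my_xy, dest_xy, ports):
            # Choose the external port whose compass direction forms the least
            # angle with the vector from this chip to the destination chip.
            dx, dy = dest_xy[0] - my_xy[0], dest_xy[1] - my_xy[1]
            target = math.atan2(dy, dx)

            def angle_to(port):
                px, py = COMPASS[port]
                diff = abs(math.atan2(py, px) - target)
                return min(diff, 2 * math.pi - diff)    # angular distance, wrapped to [0, pi]

            return min(ports, key=angle_to)

        # e.g. a chip at (4,2) sending towards (6,2) with ports ["N", "E"] picks "E"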

Abstract

An interconnection system (110) interconnects a plurality of reusable functional units (105 a), (105 b), (105 c). The system (110) comprises a plurality of nodes (135), (140), (145), (150), (155), (160) each node communicating with a functional unit. A plurality of data packets are transported between the functional units. Each data packet has routing information associated therewith to enable a node to direct the data packet via the interconnection system.

Description

    TECHNICAL FIELD
  • The present invention relates to an interconnection network. In particular, but not exclusively, it relates to an intra chip interconnection network. [0001]
  • BACKGROUND OF THE INVENTION
  • A typical bus system for interconnecting a plurality of functional units (or processing units) consists of either a set of wires with tri-state drivers, or two uni-directional data-paths incorporating multiplexers to get data onto the bus. Access to the bus is controlled by an arbitration unit, which accepts requests to use the bus, and grants one functional unit access to the bus at any one time. The arbiter may be pipelined, and the bus itself may be pipelined in order to achieve a higher clock rate. In order to route data along the bus, the system may comprise a plurality of routers which typically comprise a look up table. The data is then compared with the entries within the routing look up table in order to route the data onto the bus to its correct destination. [0002]
  • However, such routing schemes cannot be realised on a chip, since the complexity and the size of their components make this infeasible. This has been overcome in existing on-chip bus systems by using a different scheme in which data is broadcast, that is, transferring the data from one functional unit to a plurality of other functional units simultaneously. This avoids the need for routing tables. However, broadcasting data to all functional units on the chip consumes considerable power and is, thus, inefficient. Also it is becoming increasingly difficult to transfer data over relatively long distances in one clock cycle. [0003]
  • Furthermore, in a typical bus system, since every request to use the bus (transactor) must connect to the central arbiter, this limits the scalability of the system. As bigger systems are built, the functional units are further from the arbiter, so latency increases and the number of concurrent requests that may be handled by a single arbiter is limited. Therefore, in such central arbiter based bus systems, the length of the bus and the number of transactors are normally fixed at the outset. Therefore, it would not be possible to lengthen the bus at a later stage to meet varying system requirements. [0004]
  • Another form of interconnection is a direct interconnection network. These types of networks typically comprise a plurality of nodes, each of which is a discrete router chip. Each node (router) may connect to a processor and to a plurality of other nodes to form a network topology. [0005]
  • In the past, it has been infeasible to use this network-based approach as a replacement for on-chip buses because the individual nodes are too big to be implemented on a chip. [0006]
  • Many existing buses are created to work with a specific protocol. Many of the customised wires relate to specific features of that protocol. Conversely, many protocols are based around a specific bus implementation, for example having specific data fields to aid the arbiter in some way. [0007]
  • SUMMARY OF THE INVENTION
  • The object of the present invention is to provide an interconnection network as an on-chip bus system. This is achieved by routing data on the bus as opposed to broadcasting data. The routing of the data is achieved by a simple addressing scheme in which each transaction has routing information associated therewith, for example a geographical address, which enables the nodes within the interconnection network to route the transaction to its correct destination. [0008]
  • In this way, the routing information contains information on the direction to send the data packet. This routing information is not merely an address of a destination but provides directional information, for example x,y coordinates of a grid to give direction. Thus the nodes do not need routing table(s) or global signals to determine the direction since all the information the node needs is contained in the routing information of the data packets. This enables the circuitry of the node and the interconnection system to be simplified, making integration of the system onto a chip feasible. [0009]
  • If each functional unit is connected to a node, and all nodes are connected together, then a pipeline connection will exist between each pair of nodes in the system. The number of intervening nodes will govern the number of pipeline stages. If there is a pair of joined nodes where the distance between them is too great to transmit data within a single clock cycle, a repeater block can be inserted between the nodes. This block registers the data, while maintaining the same protocol as the other bus blocks. The inclusion of the repeater blocks allows an interconnection of arbitrary length to be created. [0010]
  • The interconnection system according to the present invention can be utilised in an intra-chip interconnection network. Data transfers are all packetized, and the packets may be of any length that is a multiple of the data-path width. The nodes of the bus used to create the interconnection network (nodes and T-switches) all have registers on the data-path(s). [0011]
  • The main advantage of the present invention is that it is inherently re-usable. The implementer need only instantiate enough functional blocks to form an interconnection of the correct length, with the right number of interfaces, and with enough repeater blocks to achieve the desired clock rate. [0012]
  • The interconnection system in accordance with the present invention employs distributed arbitration. The arbitration capability grows as more blocks are added. Therefore, if the bus needs to be lengthened, it is a simple matter of instantiating more nodes and possibly repeaters. Since each module manages its own arbitration within itself, the overall arbitration capability of the interconnect increases. This makes the bus system of the present invention more scalable (in length and overall bandwidth) than other conventional bus systems. [0013]
  • The arbitration adopted by the system of the present invention is truly distributed and ‘localised’. This has been simplified such that there is no polling to see if the downstream route is free as in conventional distributed systems, instead this information is initiated by the ‘blocked node’ and pipelined back up the interconnection (bus) by upstream nodes. [0014]
  • The interconnection in accordance with the present invention is efficient in terms of power consumption. Since packets are routed, rather than broadcast, only the wires between the source and destination node are toggled. The remaining bus drivers are clock-gated. Hence the system of the present invention consumes less power. [0015]
  • Furthermore, every node on the bus has a unique address associated with it: an interface address. A field in the packet is reserved to hold a destination interface address. Each node on the bus will interrogate this field of an incoming packet; if it matches its interface address it will route the packet off the interconnection (or bus); if it does not match it will route the packet down the bus. The addressing scheme could be extended to support "wildcards" for broadcast messages; if a subset of the address matches the interface address then the packet is both routed off the bus and passed on down the bus, otherwise it is just sent on down the bus. [0016]
  • For packets coming on to the bus, each interface unit interrogates the destination interface address of the packet. This is used to decide which direction a packet arriving on the bus from an attached unit is to be routed. In the case of a linear bus, this could be a simple comparison: if the destination address is greater than the interface address of the source of the data then the packet is routed "up" the bus, otherwise the packet is routed "down" the bus. This could be extended to each interface unit such that each node maps destination addresses, or ranges of addresses, to directions on the bus. [0017]
  • Preferably, the interface unit sets a binary lane signal based on the result of this comparison. In this way functionality is split between the node and interface unit. All “preparation” of the data to be transported (including protocol requirements) is carried out in the interface unit. This allows greater flexibility as the node is unchanging irrespective of the type of data to be transported, allowing the node to be re-used in different circuits. More preferably the node directs the packet off the interconnection system to a functional unit. [0018]
  • More preferably, for data destined for the interconnection, the interface unit can carry out the following functions: take the packet from the functional unit; ensure a correct destination module ID, head bit and tail bit; compare the destination module ID to the local module ID and set a binary lane signal based on the result of this comparison; pack the module ID, data and any high level (non bus) control signals into a flit; implement any protocol change necessary; and pass the lane signal and flit to the node using the inject protocol, as sketched below. [0019]
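  • A minimal C sketch of the injection-side steps listed above is given below, assuming an illustrative flit structure and the simple linear-bus comparison for the lane signal; the field names, widths and the single-flit example are not mandated by the embodiment.

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative flit container: the payload plus side-band bits packed by
     * the interface unit. Field layout is an assumption for this sketch.    */
    typedef struct {
        uint64_t payload;   /* data and high level (non bus) control signals  */
        uint8_t  mod_id;    /* destination interface (module) ID              */
        uint8_t  head;      /* '1' on the first flit of a packet              */
        uint8_t  tail;      /* '1' on the last flit of a packet               */
        uint8_t  lane;      /* binary lane signal passed to the node          */
    } flit_t;

    static flit_t make_header_flit(uint64_t data, uint8_t dest_mod_id,
                                   uint8_t local_mod_id, int single_flit_packet)
    {
        flit_t f;
        f.payload = data;
        f.mod_id  = dest_mod_id;
        f.head    = 1;
        f.tail    = single_flit_packet ? 1 : 0;
        /* Linear-bus rule: route "up" (lane 1) if the destination interface
         * ID is greater than the local one, otherwise "down" (lane 0).      */
        f.lane    = (dest_mod_id > local_mod_id) ? 1 : 0;
        return f;
    }

    int main(void)
    {
        flit_t f = make_header_flit(0xCAFEu, 7, 3, 1);
        printf("mod=%u head=%u tail=%u lane=%u\n",
               (unsigned)f.mod_id, (unsigned)f.head,
               (unsigned)f.tail, (unsigned)f.lane);
        return 0;
    }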
  • A T-junction or switch behaves in a similar way; the decision here is simply whether to route the packet down one branch or the other. This would typically be done for ranges of addresses; if the address is larger than some predefined value then the packets are routed left, otherwise they are routed right. However, more complex routing schemes could be implemented if required. [0020]
  • The addressing scheme can be extended to support inter-chip communication. In this case a field in the address is used to define a target chip address with, for example, 0 in this field representing a local address of the chip. When a packet arrives at the chip this field will be compared with the pre-programmed address of the chip. If they match then the field is set to zero and the local routing operates as above. If they do not match, then the packet is routed along the bus to the appropriate inter-chip interface in order to be routed towards its final destination. This scheme could be extended to allow a hierarchical addressing scheme to manage routing across systems, sub-systems, boards, groups of chips, as well as individual chips. [0021]
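  • The two-level (chip, module) decision described above might be sketched as follows in C; the field widths, the convention that 0 denotes the local chip and the function names are assumptions for illustration only.

    #include <stdint.h>
    #include <stdio.h>

    #define LOCAL_CHIP 0u   /* assumed encoding: 0 in the chip field = local */

    typedef struct { uint8_t chip; uint8_t mod; } dest_t;

    /* Returns 1 if the packet is for this chip (clearing the chip field so
     * that on-chip routing proceeds on the module ID alone), or 0 if it must
     * be forwarded towards an inter-chip interface.                         */
    static int accept_locally(dest_t *d, uint8_t my_chip_id)
    {
        if (d->chip == LOCAL_CHIP || d->chip == my_chip_id) {
            d->chip = LOCAL_CHIP;
            return 1;
        }
        return 0;
    }

    int main(void)
    {
        dest_t d = { 5, 12 };
        printf("at chip 5: %s\n",
               accept_locally(&d, 5) ? "route locally" : "forward off-chip");
        d.chip = 7;
        printf("at chip 5: %s\n",
               accept_locally(&d, 5) ? "route locally" : "forward off-chip");
        return 0;
    }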
  • The system according to the present invention is not suitable for all bus-type applications. The source and destination transactors are decoupled, since there is no central arbitration point. The advantage of the approach of the present invention is that long buses (networks) can be constructed, with very high aggregate bandwidth. [0022]
  • The system of the present invention is protocol agnostic. The interconnection of the present invention merely transports data packets. Interface units in accordance with the present invention manage all protocol specific features. This means that it is easy to migrate to a new protocol, since only the interface units need to be re-designed. [0023]
  • The present invention also provides flexible topology and length. [0024]
  • The repeater blocks of the present invention allow very high clock rates in that the overall modular structure of the interconnection prevents the clock rate being limited by long wires. This simplifies the synthesis and layout. The repeater blocks not only pipeline the data as it goes downstream but also implement a flow control protocol, pipelining blockage information up the interconnection (or bus) (rather than a blocking signal being distributed globally). When blocked, a feature of this mechanism is that data compression (inherent buffering) is achieved on the bus of at least double the latency figure, i.e. if latency through the repeater is one cycle then two data flow control digits (flits: the basic unit of data transfer over the interconnection of the present invention, comprising n bytes of data as well as some side-band control signals; the flow control digit size equals the size of the bus data-path) will concatenate when it is blocked. This means that the scope of any blocking is minimised, thus reducing any queuing requirement in a functional block. [0025]
  • The flow of data flow control digits is managed by a flow control protocol, in conjunction with double-buffering in a store block (and repeater unit) as described previously. [0026]
  • The components of the interconnection of the present invention manage the transportation of packets. Customised interface units handle protocol specific features. These typically involve packing and unpacking of control information and data, and address translation. A customised interface unit can be created to match any specific concurrency. [0027]
  • Many packets can be travelling along separate segments of the interconnection of the present invention simultaneously. This allows the achievable bandwidth to be much higher than the raw bandwidth of the wires (width of bus, multiplied by clock rate). If there are, for example, four adjacent on-chip blocks A, B, C and D, then A and B can communicate at the same time that C and D communicate. In this case the achievable bandwidth is twice that of a broadcast-based bus. [0028]
  • Packets are injected (gated) onto the interconnection at each node, so that each node is allocated a certain amount of the overall bandwidth allocation (e.g. by being able to send, say, 10 flow control digits within every 100 cycles). This distributed scheme controls the overall bandwidth allocation. [0029]
  • It is possible to keep forcing packets onto the interconnection of the present invention until it saturates. All packets will eventually be delivered. This means the interconnection can be used as a buffer with an in-built flow control mechanism.[0030]
  • BRIEF DESCRIPTION OF DRAWINGS
  • FIG. 1 is a block schematic diagram of the system incorporating the interconnection system according to an embodiment of the present invention; [0031]
  • FIG. 2 is a block schematic diagram illustrating the initiator and target of a virtual component interface system of FIG. 1, [0032]
  • FIG. 3 is a block schematic diagram of the node of the interconnection system shown in FIG. 1; [0033]
  • FIG. 4 is a block schematic diagram of connection over the interconnection according to the present invention between virtual components of the system shown in FIG. 1; [0034]
  • FIG. 5a is a diagram of the typical structure of the T-switch of FIG. 1; [0035]
  • FIG. 5b is a diagram showing the internal connection of the T-switch of FIG. 5a; [0036]
  • FIG. 6 illustrates the Module ID (interface ID) encoding of the system of FIG. 1; [0037]
  • FIG. 7 illustrates handshaking signals in the interconnection system according to an embodiment of the present invention; [0038]
  • FIG. 8 illustrates the blocking behaviour of the interconnection system of an embodiment of the present invention when occup[1:0]=01; [0039]
  • FIG. 9 illustrates blocking for two cycles of the interconnection system according to an embodiment of the present invention; [0040]
  • FIG. 10 illustrates the virtual component interface handshake according to an embodiment of the present invention; [0041]
  • FIG. 11 illustrates a linear chip arrangement of the system according to an embodiment of the present invention; [0042]
  • FIG. 12 is a schematic block diagram of the interconnection system of the present invention illustrating an alternative topology; [0043]
  • FIG. 13 is a schematic block diagram of the interconnection system of the present invention illustrating a further alternative topology; [0044]
  • FIG. 14 illustrates an example of a traffic handling subsystem according to an embodiment of the present invention; [0045]
  • FIG. 15 illustrates a system for locating chips on a virtual grid according to a method of a preferred embodiment of the present invention; and [0046]
  • FIG. 16 illustrates routing a transaction according to the method of a preferred embodiment of the present invention.[0047]
  • DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
  • The basic mechanism for communicating data and control information between functional blocks is that blocks exchange messages using the [0048] interconnection system 100 according to the present invention. The bus system can be extended to connect blocks in a multi chip system, and the same mechanism works for blocks within a chip or blocks on different chips.
  • An example of a [0049] system 100 incorporating the interconnection system 110 according to an embodiment of the present invention, as shown in FIG. 1, comprises a plurality of reusable on-chip functional blocks or virtual component blocks 105 a, 105 b and 105 c. These functional units interface to the interconnection and can be fixed. They can be re-used at various levels of abstraction (e.g. RTL, gate level, GDSII layout data) in different circuit designs. The topology can be fixed once the size, aspect ratio and the location of the I/Os to the interconnection are known. Each on-chip functional unit 105 a, 105 b, 105 c is connected to the interconnection system 110 via its interface unit. The interface unit handles address decoding and protocol translation. The on-chip functional block 105 a, for example, is connected to the interconnection system 110 via an associated virtual component interface initiator 115 a and peripheral virtual component interface initiator 120 a.
  • The on-chip [0050] functional block 105 b, for example, is connected to the interconnection system 110 via an associated virtual component interface target 125 b and peripheral virtual component interface target 130 b. The on-chip functional block 105 c, for example, is connected to the interconnection system 110 via an associated virtual component interface initiator 115 c and peripheral virtual component interface target 130 c. The associated initiators and targets for each on-chip functional block shown in FIG. 1 are purely illustrative and may vary depending on the associated block requirements. A functional block may have a number of connections to the interconnection system. Each connection has an advanced virtual component interface (extensions forming a superset of the basic virtual component interface; this is the protocol used for the main data interfaces in the system of the present invention) or a peripheral virtual component interface (a low bandwidth interface allowing atomic operations, mainly used in the present invention for control register access).
  • One currently accepted protocol for connecting such on-chip functional units as shown in FIG. 1 to a system interconnection according to the embodiment of the present invention is virtual component interface. Virtual component interface is an OCB standard interface to communicate between a bus and/or virtual component, which is independent of any specific bus or virtual component protocol. [0051]
  • There are three types of virtual component interfaces: peripheral [0052] virtual component interface 120 a, 130 b, 130 c, basic virtual component interface and advanced virtual component interface. The basic virtual component interface is a wider, higher bandwidth interface than the peripheral virtual component interface. The basic virtual component interface allows split transactions. Split transactions are where the request for data and the response are decoupled, so that a request for data does not need to wait for the response to be returned before initiating further transactions. Advanced virtual component interface is a superset of basic virtual component interface; advanced virtual component interface and peripheral virtual component interface have been adopted in the system according to the embodiment of the present invention.
  • The advanced virtual component interface unit comprises a target and initiator. The target and initiator are virtual components that send request packets and receive response packets. The initiator is the agent that initiates transactions, for example, DMA (or EPU on F150). [0053]
  • As shown in FIG. 2, an interface unit that initiates a read or write transaction is called an initiator [0054] 210 (issues a request 220), while an interface that receives the transaction is called the target 230 (responds to a request 240). This is the standard virtual component terminology.
  • Communication between each on-chip [0055] functional block 105 a, 105 b and 105 c and its associated initiators and targets is made using the virtual component interface protocol. Each initiator 115 a, 120 a, 115 c and target 125 b, 130 b, 130 c is connected to a unique node 135, 140, 145, 150, 155 and 160. Communication between each initiator 115 a, 120 a, 115 c and target 125 b, 130 b, 130 c uses the protocol in accordance with the embodiment of the present invention and as described in more detail below.
  • The [0056] interconnection system 110 according to an embodiment of the present invention comprises three separate buses 165, 170 and 175. The RTL components have parameterisable widths, so these may be three instances of different widths. An example might be a 64-bit wide peripheral virtual component bus 170 (32 address bits + 32 data bits), a 128-bit advanced virtual component interface bus 165, and a 256-bit advanced virtual component interface bus 175. Although three separate buses are illustrated here, it is appreciated that the interconnection system of the present invention may incorporate any number of separate buses.
  • At regular intervals along the bus length a [0057] repeater unit 180 may be inserted for all the buses 165, 170 and 175. There is no restriction on the length of the buses 165, 170 and 175. Variations in the length of the buses 165, 170 and 175 would merely require an increased number of repeater units 180. Repeater units would of course only be required when the timing constraints between two nodes cannot be met due to the length of wire of the interconnection.
  • For complex topologies, T-switches (3-way connectors or the like) [0058] 185 can be provided. The interconnection system of the present invention can be used in any topology but care should be taken when the topology contains loops as deadlock may result.
  • Data is transferred on the interconnection network of the present invention in packets. The packets may be of any length that is a multiple of the data-path width. The [0059] nodes 135, 140, 145, 150, 155 and 160 according to the present invention used to create the interconnection network (node and T-switch) all have registers on the data-path(s).
  • Each interface unit is connected to a node within the interconnection system itself, and therefore to one particular lane of the bus. Connections may be of initiator or target type, but not both—following from the conventions of the virtual component interface. In practice every block is likely to have a peripheral virtual component interface target interface for configuration and control. [0060]
  • The bus components according to the embodiment of the present invention use distributed arbitration, where each block in the bus system manages access to its own resources. [0061]
  • A [0062] node 135 according to the embodiment of the present invention is illustrated in FIG. 3. Each of the nodes 135, 140, 145, 150, 155 and 160 is substantially similar. Node 135 is connected to the bus 175 of FIG. 1. Each node comprises a first and second input store 315, 320. The first input store 315 has an input connected to a first bus lane 305. The second input store 320 has an input connected to a second bus lane 310. The output of the first input store 315 is connected to a third bus lane 306 and the output of the second input store 320 is connected to a fourth bus lane 311. Each node further comprises an inject control unit 335 and a consume control unit 325. The node may not require the consume arbitration, for example the node may have an output for each uni-directional lane but with the consume handshaking retained. The input of the inject control unit 335 is connected to the output of an interface unit of the respective functional unit for that node. The outputs of the inject control unit 335 are connected to a fifth bus lane 307 and a sixth bus lane 312. The input of the consume control unit 325 is connected to the output of a multiplexer 321. The inputs of the multiplexer 321 are connected to the fourth bus lane 311 and the third bus lane 306. The output of the consume control unit 325 is connected to a bus 330 which is connected to the interface unit of the respective functional unit for that node. The fifth bus lane 307 and the third bus lane 306 are connected to the inputs of a multiplexer 308. The output of the multiplexer 308 is connected to the first bus lane 305. The fourth bus lane 311 and the sixth bus lane 312 are connected to the inputs of a multiplexer 313. The output of the multiplexer 313 is connected to the second bus lane 310.
  • The nodes are the connection points where data leaves or enters the bus. They also form part of the transport medium. The node forms part of the bus lane to which it connects, including both directions of data path. The node conveys data on the lane to which it connects, with one cycle of latency when not blocked. It also allows the connecting functional block to inject and consume data in either direction, via its interface unit. Arbitration of injected or passing data is performed entirely within the node. Internally, [0063] bus 175 consists of a first lane 305 and a second lane 310. The first and second lanes 305 and 310 are physically separate unidirectional buses that are multiplexed and de-multiplexed to the same interfaces within the node 135. As illustrated in FIG. 3, the direction of data flow of the first lane 305 is in the opposite direction to that of the second lane 310. Each lane 305 and 310 has a lane number. The lane number is a parameter that is passed from the interface unit to the node to determine which lane (and hence which direction) each packet is sent to. Of course it is appreciated that the direction of the data flow of the first and second lanes 305 and 310 can be in the same direction. This would be desirable if the blocks transacting on the bus only need to send packets in one direction.
  • The [0064] node 135 is capable of concurrently receiving and injecting data on the same bus lane. At the same time it is possible to pass data through on the other lane. Each uni-directional lane 305, 310 carries a separate stream 306, 307, 311, 312 of data. These streams 306, 307, 311, 312 are multiplexed together at the point 321 where data leaves the node 135 into the on-chip module 105 a (not shown here) via the interface unit 115 a and 120 a (not shown here). The data streams 306, 307, 311, 312 are de-multiplexed from the on-chip block 105 a onto the bus lanes 305 and 310 in order to place data on the interconnection 110.
  • This is an example of local arbitration, where competition for resources is resolved in the [0065] block 105 a where those resources reside. In this case, it is competition for access to bus lanes, and for access to the single lane coming off the bus. This approach of using local arbitration is used throughout the interconnection system, and is key to its scalability. An alternative would be that both output buses come from the node to the functional unit and then the arbitration mux would not be needed.
  • Each lane can independently block or pass data through. Data can be consumed from one lane at a time, and injected on one lane at the same time. Concurrent inject and consume on the same lane is also permitted. Which lane each packet is injected on is determined within the interface unit. [0066]
  • Each input store (or register) [0067] 315 and 320 registers the data as it passes from node to node. Each store 315, 320 contains two flit-wide registers. When there is no competition for bus resources, only one of the registers is used. When the bus blocks, both registers are then used. It also implements the ‘block on header’ feature. This is needed to allow packets to be blocked at the header flit so that a new packet can be injected onto the bus.
  • The [0068] output interface unit 321, 325 multiplexes both bus lanes 305, 310 onto one lane 330 that feeds into the on-chip functional unit 105 a via the interface unit which is connected to the node 135. The output interface unit 321, 325 also performs an arbitration function, granting one lane access to the on-chip functional unit, while blocking the other. Each node also comprises an input interface unit 335. The input interface unit 335 performs de-multiplexing of packets onto one of the bus lanes 305, 310. It also performs an arbitration function, blocking the packet that is being input until the requested lane is available.
  • A plurality of [0069] repeater units 180 are provided at intervals along the length of the interconnection 110. Each repeater unit 180 is used to introduce extra registers on the data path. It adds an extra cycle of latency, but is only used where there is a difficulty meeting timing constraints. Each repeater unit 180 comprises a store similar to the store unit of the nodes. The store unit merely passes data onwards, and implements blocking behaviour. There is no switching carried out in the repeater unit. The repeater block allows for more freedom in chip layout. For example, it allows long lengths of wire between nodes; or, where a block has a single node connecting to a single lane, repeaters may be inserted into the other lanes in order to produce uniform timing characteristics over all lanes. There may be more than one repeater between two nodes.
  • The system according to the embodiment of the present invention is protocol agnostic, that is to say, the data-transport blocks such as the [0070] nodes 135, 140, 145, 150, 155, 160, repeater units 180 and T-switch 185 simply route data packets from a source interface to a destination interface. Each packet will contain control information and data. The packing and unpacking of this information is performed in the interface units 115 a, 120 a, 125 b, 130 b, 115 c, 130 c. In respect of the preferred embodiment, these interface units are virtual component interfaces, but it is appreciated that any other protocol could be supported by creating customised interface units.
  • A large on-chip block may have several interfaces to the same bus. [0071]
  • The target and initiator [0072] 115 a, 120 a, 125 b, 130 b, 115 c, 130 c of the interface units perform conversion between the advanced virtual component interface and bus protocols in the initiator, and from the bus to the advanced virtual component interface in the target. The protocol is an asynchronous handshake on the advanced virtual component interface side, illustrated in FIG. 10. The interface unit initiator comprises a send path. This path performs conversion from the advanced virtual component interface communication protocol to the bus protocol. It extracts a destination module ID (interface ID) from the address (a block may be connected to several buses, with a different module (interface) ID on each bus), packs it into the correct part of the packet, and uses the module ID in conjunction with a hardwired routing table to generate a lane number (e.g. 1 for right, 0 for left). The initiator blocks the data at the advanced virtual component interface when it cannot be sent onto the bus. The interface unit initiator also comprises a response path. The response path receives previously requested data, converting from the bus communication protocol to the virtual component interface protocol. It blocks data on the bus if the on-chip virtual component block is unable to receive it.
  • The interface unit target comprises a send path which receives incoming read and write requests. The target converts from bus communication protocol to advanced virtual component interface protocol. It blocks data on the bus if it cannot be accepted across the virtual component interface. The target also comprises a response path which carries read (and for verification purposes, write) requests. It converts advanced virtual component interface communication protocol to bus protocol and blocks data at the advanced virtual component interface if it cannot be sent onto the bus. [0073]
  • The other type of interface unit utilised in the embodiment of the present invention is a peripheral virtual component unit. The main differences between the peripheral virtual component interface and the advanced virtual component interface are the data interface of the peripheral virtual component interface is potentially narrower (up to 4 bytes) than the advanced virtual component interface and the peripheral virtual component interface is not split transaction. [0074]
  • The peripheral virtual component interface units perform conversion between the peripheral virtual component interface and bus protocols in the initiator, and from the bus protocol to peripheral virtual component interface protocol in the target. The protocol is an asynchronous handshake on the peripheral virtual component interface side. [0075]
  • The interface unit initiator comprises a send path. It generates destination module ID and the transport lane number from memory address. The initiator blocks the data at the peripheral virtual component interface when it cannot be sent onto the bus. The initiator also comprises a response path. This path receives previously requested data, converting from bus communication protocol to the peripheral virtual component interface protocol. It also blocks data on the bus if the on-chip block (virtual component block) is unable to receive it. [0076]
  • The peripheral virtual component interface unit target comprises a send path which receives incoming read and write requests. It blocks data on the bus if it cannot be accepted across the virtual component interface. The target also comprises a response path which carries read (and for verification purposes, write) requests. It converts peripheral virtual component interface communication protocol to bus protocol and blocks data at the virtual component interface if it cannot be sent onto the bus. [0077]
  • The peripheral virtual component interface initiator may comprise a combined initiator and target. This is so that the debug registers (for example) of an initiator can be read from. [0078]
  • With reference to FIG. 4, the virtual component (on-chip) blocks can be connected to each other over the interconnection system according to the present invention. A first virtual component (on-chip) block [0079] 425 is connected point to point to an interface unit target 430. The interface unit target 430 presents a virtual component initiator interface 440 to the virtual component target 445 of on-chip block 425. The interface unit target 430 uses a bus protocol conversion unit 448 to interface to the bus interconnect 450. The interface unit initiator 460 presents a target interface 470 to the initiator 457 of the second on-chip block 455 and, again, uses a bus protocol conversion unit 468 on the other side.
  • The T-switch [0080] 185 of FIG. 1 is a block that joins 3 nodes, allowing more complex interconnects than simple linear ones. At each input port the interface ID of each packet is decoded and translated into a single bit, which represents the two possible outgoing ports. A hardwired table inside the T-switch performs this decoding. There is one such table for each input port on the T-switch. Arbitration takes place for the output ports if there is a conflict. The winner may send the current packet, but must yield when the packet has been sent. FIGS. 5a and 5b show an example of the structure of a T-switch.
  • The T-switch comprises three sets of input/output ports [0081] 505, 510, 515 connected to each pair of unidirectional bus lanes 520, 525, 530. Within the T-switch, a T-junction 535, 540, 545 is provided for each pair of bus lanes 520, 525, 530 such that an incoming bus 520 coming into an input port 515 can be output via output port 505 or 510, for example.
  • Packets do not change lanes at any point on the bus, so the T-switch can be viewed as a set of n 3-way switches, where n is the number of uni-directional bus lanes. The T-switch [0082] 185 comprises a lane selection unit. The lane selection unit takes in the module ID of incoming packets and produces a 1-bit result corresponding to the two possible output ports on the switch. The T-switch also comprises a store block on each input lane. Each store block stores data flow control digits and allows them to block in place if the output port is temporarily unable to receive. It also performs a block on header function, which allows switching to occur at the packet level (rather than the flow control digit level). The T-switch also includes an arbiter for arbitration between requests to use output ports.
  • During initialisation, the interconnection system according to the embodiment of the present invention powers up into a usable state. Routing information is hardcoded into the bus components. A destination module interface ID (mod ID) for example as illustrated in FIG. 6 is all that is required to route a packet to another node. In order for that node to return a response packet, it must have been sent the module interface ID of the sender. [0083]
  • There may be more than one interconnection in a processing system. On each bus, every interface (which includes an inject and consume port) has a unique ID. These ID's are hard-coded at silicon compile-time. [0084]
  • Units attached to the bus (on-chip blocks) are free to start communicating straight after reset. The interface unit will hold off communications (by not acknowledging them) until it is ready to begin operation. [0085]
  • The interconnection system according to the present invention has an internal protocol that is used throughout. At the interfaces to the on-chip blocks this may be converted to some other protocol, for example the virtual component interface as described above. The internal protocol will be referred to as the bus protocol. This bus protocol allows single cycle latency for packets travelling along the bus when there is no contention for resources, and allows packets to block in place when contention occurs. [0086]
  • The bus protocol is used for all internal (non interface/virtual component interface) data transfers. It consists of five signals: [0087] occup 705, head 710, tail 715, data 720 and valid 725 between a sender 735 and a receiver 730. These are shown in FIG. 7.
  • The packets consist of one or more flow control digits. On each cycle that the sender asserts the valid signal, the receiver must accept the data on the next positive clock edge. [0088]
  • The [0089] receiver 730 informs the sender 735 about its current status using the occup signal 705. This is a two-bit wide signal.
    TABLE I
    Occup signal values and their meaning.

    Occup[1:0]  Meaning
    00          Receiver is empty - can send data.
    01          Receiver has one flow control digit - if sending a flow
                control digit on this cycle, don't send a flow control
                digit on the next cycle.
    10          The receiver is full. Don't send any flow control digits
                until Occup decreases.
    11          Unused.
  • The [0090] occup signal 705 tells the sender 735 if and when it is able to send data. When the sender 735 is allowed to transmit a data flow control digit, it is qualified with a valid signal 725.
  • The first flow control digit in each packet is marked by head=‘1’. The last flow control digit is marked by tail=‘1’. A single flow control digit packet has signals head=tail=valid=‘1’. Each node and T-Switch use these signals to perform switching at the packet level. [0091]
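  • Read together with Table I, the sender side of this handshake amounts to a small decision rule; the following C fragment is a behavioural sketch only, with illustrative names.

    #include <stdio.h>

    /* Behavioural sketch of the sender-side rule implied by Table I: occup
     * comes from the downstream receiver, sent_last_cycle is the sender's
     * own history.                                                          */
    static int may_send_flit(unsigned occup, int sent_last_cycle)
    {
        switch (occup) {
        case 0u: return 1;                   /* 00: receiver empty           */
        case 1u: return !sent_last_cycle;    /* 01: skip a cycle if a flit
                                                    was just sent            */
        default: return 0;                   /* 10: receiver full            */
        }
    }

    int main(void)
    {
        printf("%d %d %d\n",
               may_send_flit(0u, 1),   /* 1: always allowed when empty        */
               may_send_flit(1u, 1),   /* 0: just sent and receiver half full */
               may_send_flit(2u, 0));  /* 0: receiver full                    */
        return 0;
    }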
  • FIG. 8 shows an example of blocking behaviour on the interconnect system according to an embodiment of the present invention. The occup signal is set to '01', meaning 'if sending a flow control digit this cycle, don't send one on the next cycle'. [0092]
  • FIG. 9 shows an example of the blocking mechanism more completely. The occup signal is set to 01 (binary), then to 10 (binary). The sender can resume transmitting flow control digits when the occup signal is set back to 01—at that point it is not currently sending a flow control digit, so it is able to send one on the next cycle. [0093]
  • The protocol at the boundary between the node and the interface unit is different from that just described, and is similar to that used by the virtual component interface. At the sending and receiving interfaces, there is a val and an ack signal. When val=ack=1, a flow control digit is exchanged for the inject protocol. The consume (bus output) protocol is different to the inject protocol but is the minimum logic to allow registered outputs (and thus simplifies synthesis and integration into a system on chip). The consume protocol is defined as: on the rising clock edge, data is blocked on the next clock edge if CON_BLOCK=1; on the rising clock edge, data is unblocked on the next clock edge if CON_BLOCK=0; CON_BLOCK is the flow control (blocking) signal from the functional unit. [0094]
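  • The registered nature of the consume protocol can be illustrated with a short C sketch in which the CON_BLOCK value sampled on one rising edge only takes effect on the next edge; the cycle loop and signal trace below are invented purely for illustration.

    #include <stdio.h>

    int main(void)
    {
        int con_block_sampled = 0;              /* value seen on the previous edge */
        const int con_block[6] = { 0, 0, 1, 1, 0, 0 };

        for (int cycle = 0; cycle < 6; cycle++) {
            int output_blocked = con_block_sampled;  /* effect is one cycle late */
            printf("cycle %d: CON_BLOCK=%d -> output %s\n",
                   cycle, con_block[cycle], output_blocked ? "blocked" : "flows");
            con_block_sampled = con_block[cycle];    /* register for next edge   */
        }
        return 0;
    }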
  • Of course the protocols at this interface can be varied without affecting the overall operation of the bus. [0095]
  • The difference between this and virtual component interface is that the ack signal is high by default, and is only asserted low on a cycle when data cannot be received. Without this restriction, the node would need additional banks of registers. [0096]
  • The bus protocol allows the exchange of packets consisting of one or more flow control digits. Eight bits in the upper part of the first flow control digit of each packet carry the destination module ID, and are used by the bus system to deliver the packet. The top 2 bits are also used for internal bus purposes. In all other bit fields, the packing of the flow control digits is independent of the bus system. [0097]
  • At each interface unit, virtual component interface protocol is used. The interface control and data fields are packed into bus flow control digits by the sending interface and then unpacked at the receiving interface unit. The main, high-bandwidth, interface to the bus uses the advanced virtual component interface. All features of the advanced virtual component interface are implemented, with the exception of those used to optimise the internal operation of an OCB. [0098]
  • The virtual component interface protocol uses an asynchronous handshake as shown in FIG. 10. Data is valid when VAL=ACK=1. The bus interface converts data and control information from the virtual component interface protocol to the bus internal communication protocol. [0099]
  • The bus system does not distinguish between control information and data. Instead, the control bits and data are packed up into packets and sent to the destination interface unit, where they are unpacked and separated back into data and control. [0100]
  • Although in the preferred embodiment, virtual component interface compliant interface units are utilised, it is appreciated that different interface units may be used instead (e.g. ARM AMBA compliant interfaces). [0101]
  • Table II shows the fields within the data flow control digits that are used by the interconnection system according to an embodiment of the present invention. All other information in the flow control digits is simply transported by the bus. The encoding and decoding is performed by the interface units. The interface units also insert the head and tail bits into the flow control digits, and insert the MOD ID in the correct bit fields. [0102]
    TABLE II
    Specific fields.

    Name    Bit                                Comments
    Head    FLOW CONTROL DIGIT_WIDTH - 1       Set to '1' to indicate first flow
                                               control digit of packet.
    Tail    FLOW CONTROL DIGIT_WIDTH - 2       Set to '1' to indicate last flow
                                               control digit of packet.
    Mod ID  FLOW CONTROL DIGIT_WIDTH - 3 :     ID of interface to which packet is to
            FLOW CONTROL DIGIT_WIDTH - 10      be sent. Virtual component interface
                                               calls this MOD ID. It is really an
                                               interface ID, since a large functional
                                               unit could have multiple bus
                                               interfaces, in which case it is
                                               necessary to distinguish between them.
  • The advanced virtual component interface packet types are read request, write request and read response. A read request is a single flow control digit packet and all of the relevant virtual component interface control fields are packed into the flow control digit. A write request consists of two or more flow control digits. The first flow control digit contains virtual component interface control information (e.g. address). The subsequent flow control digits contain data and byte enables. A read response consists of one or more flow control digits. The first and subsequent flow control digits all contain data plus virtual component interface response fields (e.g. RSCRID, RTRDID and RPKTID). [0103]
  • An example mapping of the advanced virtual component interface onto packets is now described. The example is for a bus with 128-bit wide data paths. It should be noted that the nodes extract the destination module ID from bits [0104] 159:152 in the first flow control digit of each packet. In the case of read response packets this corresponds with the virtual component interface RSCRID field.
    TABLE III
    Possible Virtual Component Interface fields for 128 bit wide bus

    AVCI/BVCI               Flow control  Header flow control  Read responses
    Signal Name     WIDTH   digit Bits    digit only?          only?           Direction  Comments
    CLOCK           1                                                          IA
    RESETN          1                                                          IA
    CMDACK          1                                                          TI         Handshake signal.
    CMDVAL          1                                                          IT         Handshake signal.
    WDATA           128     127:0                                              IT         Only for write requests.
    BE              16      143:128                                            IT         Only for write requests.
    ADDRESS         64      63:0          Yes                                  IT
    CFIXED          1       64            Yes                                  IT
    CLEN            8       72:65         Yes                                  IT         ***needs update***
    CMD             2       75:74         Yes                                  IT
    CONTIG          1       76            Yes                                  IT
    EOP             1                                                          IT         Handshake signal.
    CONST           1       77            Yes                                  IT
    PLEN            9       86:78         Yes                                  IT
    WRAP            1       87            Yes                                  IT
    RSPACK          1                                                          IT         Handshake signal.
    RSPVAL          1                                                          TI         Handshake signal.
    RDATA           128     127:0                              Yes             TI         Only for read responses.
    REOP            1                                                          TI         Handshake signal.
    RERROR          2       143:142                            Yes             TI         Only for read responses.
    DEFD            1       88            Yes                                  IT
    WRPLEN          5       93:89         Yes                                  IT
    RFLAG           4       141:138                            Yes             TI         Only for read responses.
    SCRID           8       151:144       Yes                                  IT
    TRDID           2       95:94         Yes                                  IT
    PKTID           8       103:96        Yes                                  IT
    RSCRID          8       159:152                            Yes             TI         Only for read responses.
    RTRDID          2       137:136                            Yes             TI         Only for read responses.
    RPKTID          8       135:128                            Yes             TI         Only for read responses.
  • Peripheral virtual component interface burst-mode read and write transactions are not supported over the bus, as these cannot be efficiently implemented. For this reason, the peripheral virtual component interface EOP signal should be fixed at logic '1'. Any additional processing unit or external units can be attached to the bus, but the EOP signal should again be fixed at logic '1'. With this change, the new unit should work normally. [0105]
  • The read request type is a single flow control digit packet carrying the 32-bit address of the data to be read. The read response is a single flow control digit response containing the requested 32 bits of data. The write request is a single flow control digit packet containing the 32-bit address of the location to be written, plus the 32 bits of data, and 4 bits of byte enable. The write response prevents a target responding to a write request in the same way that it would to a read request. [0106]
  • With all of the additional signals, 32 bit (data) peripheral virtual component interface occupies 69 bits on the bus. [0107]
    TABLE IV
    PVCI

    Signal     Bit      Read     Read      Write
    Name       Fields   Request  Response  Request  Comments
    CLOCK                                           System signal.
    RESETN                                          System signal.
    VAL                                             Handshake signal.
    ACK                                             Handshake signal.
    EOP                                             Handshake signal.
    ADDRESS    63:32    Yes                Yes
    RD         100      Yes                Yes
    BE         67:64                       Yes
    WDATA      31:0                        Yes
    RDATA      31:0              Yes
    RERROR     68                Yes
  • The internal addressing mechanism of the bus is based on the assumption that all on-chip blocks in the system have a fixed 8-bit module ID. [0108]
  • Virtual component interface specifies the use of a single global address space. Internally the bus delivers packets based on the module ID of each block in the system. The module ID is 8 bits wide. All global addresses will contain the 8 bits module ID, and the interface unit will simply extract the destination module ID from the address. The location of the module ID bits within the address is predetermined. [0109]
  • The module IDs in each system are divided into groups. Each group may contain up to 16 modules. The T-switches in the system use the group ID to determine which output port to send each packet to. [0110]
  • Within each group, there may be up to sixteen on-chip blocks, each with a unique subID. The inclusion of only sixteen modules within each group does not restrict the bus topology. Within each linear bus section, there may be more than one group, but modules from different groups may not interleave. There may be more than sixteen modules between T-switches. The only purpose of using a group ID and sub ID is to simplify the routing tables inside the T-switch(es). If there are no T-switches being used, the numbering of modules can be arbitrary. If a linear bus topology is used and the interfaces are numbered sequentially, this may simplify lane number generation, as a comparator can be used instead of a table. However, a table may still turn out to be smaller after logic minimisation. Two interfaces on different buses can have the same mod ID (=interface ID). [0111]
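  • One way to realise the group ID/sub-ID split described above is sketched below in C, treating the 8-bit module ID as a 4-bit group ID and a 4-bit sub-ID and mapping each group to a T-switch output port through a small hardwired table; the bit split and the table contents are assumptions made for the example.

    #include <stdint.h>
    #include <stdio.h>

    #define GROUP(mod_id)  (((mod_id) >> 4) & 0xFu)   /* assumed upper nibble */
    #define SUBID(mod_id)  ((mod_id) & 0xFu)          /* assumed lower nibble */

    int main(void)
    {
        /* Hypothetical per-input-port table inside a T-switch: one bit per
         * group, selecting output port 0 or 1.                              */
        const uint8_t port_for_group[16] = { 0,0,0,0, 1,1,1,1, 0,0,1,1, 0,1,0,1 };

        uint8_t mod_id = 0x6B;                        /* group 6, sub-ID 11   */
        printf("mod 0x%02X -> group %u, sub %u, output port %u\n",
               (unsigned)mod_id, (unsigned)GROUP(mod_id), (unsigned)SUBID(mod_id),
               (unsigned)port_for_group[GROUP(mod_id)]);
        return 0;
    }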
  • An example to reduce erroneous traffic on the interconnection according to the embodiment of the present invention is described here. When packets that do not have a legal mod ID are presented to the interface unit, it will acknowledge them, but will also generate an error in the rerror virtual component interface field. The packet will not be sent onto the bus. The interface unit will “absorb” it and destroy it. [0112]
  • In the preferred embodiment the bus system blocks operate at the same clock rate as synthesisable RTL blocks in any given process. For a 0.13 μm 40 G processor, the target clock rate is 400 MHz. There will be three separate buses. Each bus comprises a separate parameterised component. There will be a 64-bit wide peripheral virtual component interface bus connecting to all functional units on the chip. There will be two advanced virtual component interface buses, one with a 128-bit data-path (raw bandwidth 51.2 Gbits/sec on each unidirectional lane), the other with a 256-bit data-path (raw bandwidth 102.4 Gbits/sec on each unidirectional lane). Not all of this bandwidth can be fully utilised due to the overhead of control and request packets, and it is not always possible to achieve efficient packing of data into flow control digits. Some increase in bandwidth will be seen due to concurrent data transfers on the bus, but this can only be determined in system simulations. [0113]
  • In the embodiment, latency of packets on the bus is one cycle at the sending interface unit, one cycle per bus block (nodes and repeaters) that data passes through, one or two additional cycle(s) at the node consume unit, one cycle at receiving interface unit and n−1 cycles, where n is the packet length in flow control digits. This figure gives the latency for the entire data transfer, meaning that latency increases with packet size. [0114]
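  • As a worked example of the latency figure above, the following C fragment adds the stated contributions for an assumed path of five bus blocks, two consume cycles and a four-flit packet; the chosen numbers are illustrative.

    #include <stdio.h>

    /* Latency = 1 (sending IU) + bus blocks traversed + consume cycles
     *         + 1 (receiving IU) + (n - 1) for an n-flit packet.          */
    static int packet_latency(int bus_blocks, int consume_cycles, int flits)
    {
        return 1 + bus_blocks + consume_cycles + 1 + (flits - 1);
    }

    int main(void)
    {
        printf("latency = %d cycles\n", packet_latency(5, 2, 4));  /* 12 */
        return 0;
    }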
  • It is possible to control bandwidth allocation, by introducing programmability into the injection controller in the node. [0115]
  • It is not possible for packets to be switched between bus lanes. Once a packet has entered the bus system, it stays on the same bus lane, until it is removed at the destination interface unit. Packets on the bus are never allowed to interleave. It is not necessary to inject new flow control digits on every clock cycle. In other words, it is possible to have gap cycles. These gaps will remain in the packet while it is inside the bus system and unblocked, and will waste bandwidth. If blocked, then only validated data will concatenate and thus any intermediate non-valid data will be removed. In addition, to help minimise the number of gap cycles, it is necessary to ensure that enough FIFO buffering is provided to allow the block to keep injecting flow control digits until each packet has been completely sent, or to design the block in a manner that does not cause gaps to occur. [0116]
  • In the system according to the present invention, the length of the packets (in flow control digits) is unlimited. However, consideration should be given to excessively large packets, as they will utilise a greater amount of the bus resource. Long packets can be more efficient than a number of shorter ones, due to the overhead of having a header flow control digit. [0117]
  • The interconnection system according to the present invention does not guarantee that requested data items would be returned to a module in the order in which they were requested. The block is responsible for re-ordering packets if the order matters. This is achieved using the advanced virtual component interface pktid field, which is used to tag and reorder outstanding transactions. It cannot be assumed that data will arrive at the on-chip block in the same order that it was requested from other blocks in the system. Where ordering is important, the receiving on-chip block must be able to re-order the packets. Failure to adhere to this rule is likely to result in system deadlock. [0118]
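  • The block-side re-ordering responsibility described above can be sketched as a small buffer indexed by the pktid tag; the buffer depth, data type and names below are assumptions for illustration only.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define MAX_OUTSTANDING 8   /* assumed number of outstanding reads */

    typedef struct {
        int      valid[MAX_OUTSTANDING];
        uint32_t data[MAX_OUTSTANDING];
    } reorder_buf_t;

    /* pktid was assigned in request order, so filling the slot it names
     * restores that order regardless of when the response returns.       */
    static void response_arrived(reorder_buf_t *rb, unsigned pktid, uint32_t d)
    {
        rb->valid[pktid] = 1;
        rb->data[pktid]  = d;
    }

    int main(void)
    {
        reorder_buf_t rb;
        memset(&rb, 0, sizeof rb);

        response_arrived(&rb, 2, 0xBEEF);      /* responses arrive...        */
        response_arrived(&rb, 0, 0xCAFE);      /* ...out of request order    */
        response_arrived(&rb, 1, 0xF00D);

        for (unsigned id = 0; id < 3; id++)    /* consume in request order   */
            if (rb.valid[id])
                printf("pktid %u -> 0x%X\n", id, (unsigned)rb.data[id]);
        return 0;
    }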
  • The interconnection system according to the present invention offers considerable flexibility in the choice of interconnect topology. However, it is currently not advisable to have loops in the topology as these will introduce the possibility of deadlock. [0119]
  • However, it should be possible to program routing tables in a deadlock-free manner if loops were to be used. This would require some method of proving deadlock freedom, together with software to implement the necessary checks. [0120]
  • A further advantage of the interconnection system according to the present invention is that saturating the bus with packets will not cause it to fail. The packets will be delivered eventually, but the average latency will increase significantly. If the congestion is reduced to below the maximum throughput, then it will return to “normal operation”. [0121]
  • In this respect the following rules should be considered, namely: there should be no loops in the bus topology; on-chip blocks must not depend on transactions being returned in order; and where latency is important and multiple transactors need to use the same bus segments, there should be a maximum packet size. As mentioned above, if loops are required in the future, some deadlock prevention strategy must exist. Ideally this will include a formal proof. Further, if ordering is important, the blocks must be able to re-order the transactions. Two transactions from the same target, travelling on the same lane, will be returned in the same order in which the target sent them. If requests were made to two different targets, the ordering is non-deterministic. [0122]
  • Most of the interconnection components of the present invention will involve straightforward RTL synthesis followed by place & route. The interface units may be incorporated into either the on-chip blocks or an area reserved for the bus, depending on the requirements of the overall design; for example, there may be a lot of area free under the bus, in which case using this area rather than adding to the functional block area would make more sense, as it would reduce the overall chip area. [0123]
  • The interconnection system forms the basis of a system-on-chip platform. In order to accelerate the process of putting a system on-chip together, it has been proposed that the nodes contain the necessary "hooks" to handle distribution of the system clock and reset signals. Looking first at an individual chip, the routing of transactions and responses between initiator and target is performed by the interface blocks that connect to the interconnection system, and any intervening T-switch elements in the system. Addressing of blocks is hardwired and geographic, and the routing information is compiled into the interface and T-switch logic at chip integration time. The platform requires some modularity at the chip level as well as the block level on chips. Therefore, knowledge of which other chips, or their innards, a chip is connected to cannot be hard-coded in the chips themselves, as this may vary on different line cards. [0124]
  • However, with the present invention, it is possible to provide flexibility of chip arrangement with hard-wired routing information by giving each chip some simple rules and designing the topology and enumeration of the chips to support this. This has the dual benefit of simplicity and of being a natural extension to the routing mechanisms within chips themselves. [0125]
  • FIG. 11 illustrates an example of a linear chip arrangement. Of course, it is appreciated that different topologies can be realised according to the present invention. In such a linear arrangement, it is easy to number the chips [0126] 1100(0) to (2) etc sequentially so that any block in any chip knows that a transaction must be routed “up” or “down” the interconnection 1110 to reach its destination, as indicated in the chip ID-field of the physical address. This is exactly the same process as the block performs to route to another block on the same chip. In this case a 2-level decision is utilised. If in ‘present chip’ then route on Block ID, else route on Chip ID.
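  • The 2-level decision for the linear arrangement might be expressed in C as follows; the direction encoding and the assumption that IDs increase "up" the interconnection are illustrative only.

    #include <stdint.h>
    #include <stdio.h>

    typedef enum { CONSUME_LOCAL, ROUTE_UP, ROUTE_DOWN } decision_t;

    /* If the destination is on the present chip, route on the Block ID,
     * else route on the Chip ID.                                         */
    static decision_t route(uint8_t my_chip, uint8_t my_block,
                            uint8_t dst_chip, uint8_t dst_block)
    {
        if (dst_chip == my_chip) {
            if (dst_block == my_block)
                return CONSUME_LOCAL;
            return (dst_block > my_block) ? ROUTE_UP : ROUTE_DOWN;
        }
        return (dst_chip > my_chip) ? ROUTE_UP : ROUTE_DOWN;
    }

    int main(void)
    {
        printf("%d\n", route(1, 3, 1, 7));  /* same chip, higher block -> ROUTE_UP      */
        printf("%d\n", route(1, 3, 0, 2));  /* lower chip ID           -> ROUTE_DOWN    */
        printf("%d\n", route(1, 3, 1, 3));  /* our own interface       -> CONSUME_LOCAL */
        return 0;
    }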
  • An alternative topology is shown in FIG. 12. It comprises a [0127] first bus lane 1201 and a second bus lane 1202 arranged in parallel. The first and second bus lane correspond to the interconnection system of the embodiment of the present invention. A plurality of multi threaded array processors (MTAPs) 1210 are connected across the two bus lanes 1201, 1202. An network input device 1220, a collector device 1230, a distributor device 1240, an network output device 1250 are connected to the second bus lane 1202 and a table lookup engine 1260 is connected to the first bus lane 1201. Details of operation of the devices connected to the bus lanes is not provided here.
  • As illustrated in FIG. 12, in this alternative topology the first bus lane 1201 (256 bits wide, for example) is dedicated to fast path packet data. The second bus lane 1202 (128 bits wide, for example) is used for general non-packet data, such as table lookups, instruction fetching, external memory access, etc. Blocks accessing bus lanes 1201, 1202 use the AVCI protocol. An additional bus lane (not shown here) may be used for reading and writing block configuration and status registers; blocks accessing this lane use the PVCI protocol. [0128]
  • More generally, where the blocks are connected to the interconnection, and which lane or lanes they use, can be selected. Floor planning constraints must obviously be taken into account. Blocks can have multiple bus interfaces, for example. Lane widths can be configured to meet the bandwidth requirements of the system. [0129]
  • Since the interconnection system of the present invention uses point-to-point connections between interfaces, and uses distributed arbitration, it is possible to have several pairs of functional blocks communicating simultaneously without any contention or interference. Traffic between blocks can only interfere if that traffic travels along a shared bus segment in the same direction. This situation can be avoided by choosing a suitable layout. Thus, bus contention can be avoided in the fast path packet flow. This is important to achieve predictable and reliable performance, and to avoid overprovisioning the interconnection. [0130]
  • The example above avoids bus contention in the fast path, because the packet data flows left to right on bus lane 1201 via NIP-DIS-MTAP-COL-NOP. Since packets do not cross any bus segment more than once, there is no bus contention. There is no interference between the MTAP processors, because only one at a time is sending or receiving. Another way to avoid bus contention is to place the MTAP processors on a “spur” off the main data path, as shown in FIG. 13. [0131]
  • This topology uses a T-junction 1305 to exploit the fact that traffic going in opposite directions on the same bus segment 1300 is non-interfering. Using the T-junction block 1305 may ease the design of the bus topology to account for layout and floor planning constraints. [0132]
  • At the lowest (hardware) level of abstraction, the interconnection system of the present invention preferably supports advanced virtual component interface (AVCI) transactions, which are simply variable-size messages as defined in the virtual component interface standard, sent from an initiator interface to a target interface, possibly followed by a response at a later time. Because the response may be delayed, this is called a split transaction in the virtual component interface system. The network processing system architecture defines two higher levels of abstraction in the inter-block communication protocol: the chunk and the abstract datagram (frequently simply called a datagram). A chunk is a logical entity that represents a fairly small amount of data to be transferred from one block to another. An abstract datagram is a logical entity that represents the natural unit of data for the application. In network processing applications, abstract datagrams almost always correspond to network datagrams or packets. The distinction is made to allow for using the architecture blocks in other applications besides networking. Chunks are somewhat analogous to CSIX C frames, and are used for similar purposes, that is, to have a convenient, small unit of data transfer. Chunks have a defined maximum size, typically about 512 bytes, while datagrams can be much larger, typically up to 9K bytes; the exact size limits are configurable. When a datagram needs to be transferred from one block to another, the actual transfer is done by sending a sequence of chunks. The chunks are packaged within a series of AVCI transactions at the bus interface. [0133]
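The chunking of a datagram described above can be sketched as follows; the function name is illustrative, and the size limits shown are merely the “typical” values quoted in the text, both being configurable in practice.

    # Minimal sketch: transfer an abstract datagram as a sequence of chunks.
    # The 512-byte chunk limit and 9K-byte datagram limit are the typical
    # values quoted above; both are configurable in a real system.

    MAX_CHUNK_BYTES = 512
    MAX_DATAGRAM_BYTES = 9 * 1024

    def datagram_to_chunks(datagram: bytes):
        """Split a datagram into chunks no larger than the configured maximum."""
        if len(datagram) > MAX_DATAGRAM_BYTES:
            raise ValueError("datagram exceeds configured maximum size")
        return [datagram[i:i + MAX_CHUNK_BYTES]
                for i in range(0, len(datagram), MAX_CHUNK_BYTES)]

    # A 1400-byte packet becomes chunks of 512, 512 and 376 bytes; each chunk
    # is then packaged within AVCI transactions at the bus interface.
    chunks = datagram_to_chunks(bytes(1400))
    assert [len(c) for c in chunks] == [512, 512, 376]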
  • The system addressing scheme according to the embodiment of the present invention will now be described in more detail. The system according to an embodiment of the present invention may span a subsystem that is implemented in more than one chip. [0134]
  • Looking first at an individual chip, the routing of transactions and responses between initiators and targets is performed by the interface blocks that connect to the interconnection itself, and by the intervening T-switch elements in the interconnection. Addressing of the blocks according to an embodiment of the present invention is hardwired and geographic, and the routing information is compiled into the logic of the interface units, T-switches and node elements at chip integration time. [0135]
  • The interface ID occupies the upper part of the physical 64-bit address, the lower bits being the offset within the block. Additional physical address bits are reserved for the chip ID to support multi-chip expanses. [0136]
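A minimal sketch of this address layout is given below. The text fixes only the ordering of the fields (interface ID in the upper part, offset in the lower bits, further bits reserved for the chip ID); the particular field widths used here are assumptions for illustration only.

    # Minimal sketch of the geographic 64-bit address layout.  The widths
    # chosen (8-bit chip ID, 8-bit interface ID, 48-bit offset) are
    # illustrative assumptions, not values fixed by the specification.

    CHIP_BITS, IFACE_BITS, OFFSET_BITS = 8, 8, 48
    assert CHIP_BITS + IFACE_BITS + OFFSET_BITS == 64

    def pack(chip_id, iface_id, offset):
        return ((chip_id << (IFACE_BITS + OFFSET_BITS))
                | (iface_id << OFFSET_BITS) | offset)

    def unpack(addr):
        offset = addr & ((1 << OFFSET_BITS) - 1)
        iface_id = (addr >> OFFSET_BITS) & ((1 << IFACE_BITS) - 1)
        chip_id = addr >> (IFACE_BITS + OFFSET_BITS)
        return chip_id, iface_id, offset

    addr = pack(chip_id=3, iface_id=7, offset=0x1000)
    assert unpack(addr) == (3, 7, 0x1000)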
  • The platform according to the embodiment of the present invention requires some modularity at the chip level as well as at the block level on chips; knowledge of which other chips, or their internals, a chip is connected to cannot be hard-coded, as this may vary on different line cards. This prevents the use of the same hard-wired bus routing information scheme as exists in the interface units for transactions within one chip. [0137]
  • However, it is possible to provide flexibility of chip arrangement with hardwired routing information by giving each chip some simple rules and designing the topology and enumeration of chips to support this. This has the dual benefits of simplicity and of being a natural extension to the routing mechanisms within the chips themselves. [0138]
  • An example of a traffic handler subsystem in which the packet queue memory is implemented around two memory hub chips is shown in FIGS. 14 and 15. [0139]
  • In the example, the four chips have four connections to other chips. This results in possible ambiguities about the route that a transaction takes from one chip to the next. Therefore, it is necessary to control the flow of transactions by configuring the hardware and software appropriately, but without having to include programmable routing functions in the interconnection. [0140]
  • This is achieved by making the chip ID an x,y coordinate instead of a single number. For example, the chip ID for chip 1401 may be (4,2), for chip 1403 (5,1), for chip 1404 (5,3) and for chip 1402 (6,2). Simple, hardwired rules are applied about how to route the next hop of a transaction destined for another chip. Thus, the chips are located on a virtual “grid” such that the local rules produce the transaction flows desired. The grid can be “distorted” by leaving gaps or dislocations to achieve the desired effect. [0141]
  • Each chip has complete knowledge of itself, including how many external ports it has and their assignments to N, S, E, W compass points. This knowledge is wired into the interface units and T-switches. A chip has no knowledge at all of which other chips are connected to it, or of their x,y coordinates. [0142]
  • The local rules at each chip are these (a sketch modelling them is given after the worked example below): [0143]
  • 1. A transaction is routed out on the interface that forms the least angle with the destination's relative grid location. [0144]
  • 2. In the event of a tie, N-S interfaces are favoured over E-W. [0145]
  • Applying these rules to the four-chip example above, transactions along the main horizontal axis through the Arrivals & Dispatch chips 1401, 1402 are simply routed using up/down on the x coordinate, with y=2. [0146]
  • Transactions from Arrivals or Dispatch 1401, 1402 to one of the memory hubs 1403, 1404 have an angle of 45 degrees, and the 2nd rule applies to route these N-S and not E-W. [0147]
  • Responses from memory hubs 1403, 1404 to any other chip have a clear E/W choice because no other chip has x=5. [0148]
  • The conjecture is that there is no chip topology or transaction flow that cannot be expressed by suitable choice of chip coordinates and application of the above rules. [0149]
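The behaviour of the two local rules on the four-chip example can be modelled with the short sketch below; the port model, angle computation and function names are illustrative assumptions, since in the actual system the decision is wired into the interface units and T-switches.

    # Minimal sketch of the two hardwired per-chip routing rules, applied to
    # the four-chip example with chip IDs as x,y coordinates.
    import math

    # Compass ports as unit direction vectors on the virtual grid.
    PORTS = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

    def next_hop(my_xy, dest_xy, ports):
        """Pick the exit port forming the least angle with the destination's
        relative grid location; ties favour N-S interfaces over E-W."""
        dx, dy = dest_xy[0] - my_xy[0], dest_xy[1] - my_xy[1]
        heading = math.atan2(dy, dx)

        def angle_to(port):
            px, py = PORTS[port]
            diff = abs(heading - math.atan2(py, px))
            return min(diff, 2 * math.pi - diff)

        # Sort by angle; on a tie, False (N or S) sorts before True (E or W).
        return min(ports, key=lambda p: (round(angle_to(p), 9),
                                         p not in ("N", "S")))

    chips = {"1401": (4, 2), "1403": (5, 1), "1404": (5, 3), "1402": (6, 2)}

    # Along the main horizontal axis (y = 2) the choice is simply E or W ...
    assert next_hop(chips["1401"], chips["1402"], ["N", "S", "E", "W"]) == "E"
    # ... while the 45-degree case from 1401 to memory hub 1404 ties between
    # N and E, and the 2nd rule favours the N-S interface.
    assert next_hop(chips["1401"], chips["1404"], ["N", "S", "E", "W"]) == "N"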
  • Although a preferred embodiment of the method and system of the present invention has been illustrated in the accompanying drawings and described in the foregoing detailed description, it will be understood that the invention is not limited to the embodiment disclosed, but is capable of numerous variations and modifications without departing from the scope of the invention as set out in the following claims. [0150]

Claims (30)

1. An interconnection system for connecting a plurality of functional units, the interconnection system comprising a plurality of nodes, each node communicating with a functional unit, the interconnection system transporting a plurality of data packets between functional units, each data packet having routing information associated therewith to enable a node to direct the data packet via the interconnection system.
2. An interconnection system for interconnecting a plurality of functional units and transporting a plurality of data packets, the interconnection system comprising a plurality of nodes, each node communicating with a functional unit wherein, during transportation of the data packets between a first node and a second node, only the portion of the interconnection system between the first node and the second node is activated.
3. An interconnection system for interconnecting a plurality of functional units, each functional unit connected to the interconnection system via an interface unit, the interconnection system comprising a plurality of nodes, the interconnection system transporting a plurality of data packets, wherein each interface unit translates between the protocol for transporting the data packets and the protocol of the functional units.
4. An interconnection system for interconnecting a plurality of functional units and transporting a plurality of data packets between the functional units wherein arbitration is distributed to each functional unit.
5. An interconnection system according to any one of claims 2 to 4, wherein each data packet has routing information associated therewith to enable a node to direct the data packet via the interconnection system.
6. An interconnection system according to any one of claims 1, 3 or 4, wherein, during transportation of a data packet between a first node and a second node, only the portion of the interconnection system between the first node and the second node is activated.
7. An interconnection system according to claims 1, 2 or 4, wherein the interconnection system is protocol agnostic.
8. An interconnection system according to any one of claims 1 to 3, wherein arbitration is distributed between the functional units.
9. An interconnection system according to any one of the preceding claims further comprising a plurality of repeater units spaced along the interconnection at predetermined distances such that the data packets are transported between consecutive repeater units and/or nodes in a single clock cycle.
10. An interconnection system according to claim 9, wherein the data packets are pipelined between the nodes and/or repeater units.
11. An interconnection system according to claim 9 or 10, wherein each repeater unit comprises means to compress data upon blockage of the interconnection.
12. An interconnection system according to any one of the preceding claims, wherein the routing information includes the x,y coordinates of the destination.
13. An interconnection system according to any one of the preceding claims, wherein the clocking along the length of the interconnection system is distributed.
14. An interconnection system according to any one of the preceding claims, wherein each node comprises an input buffer, inject control and/or consume control.
15. An interconnection system according to claim 14, wherein each node can inject and output data at the same time.
16. An interconnection system according to any one of the preceding claims, wherein the interconnection system comprises a plurality of buses.
17. An interconnection system according to claim 16, wherein each node is connected to at least one of the plurality of buses.
18. An interconnection system according to claim 16 or 17, wherein at least a part of each bus comprises a pair of unidirectional bus lanes.
19. An interconnection system according to claim 18, wherein data is transported on each bus lane in an opposite direction.
20. An interconnection system according to any one of the preceding claims, further comprising at least one T-switch, the T-switch determining the direction to transport the data packets from the routing information associated with each data packet.
21. An interconnection system according to any one of the preceding claims, wherein delivery of the data packet is guaranteed.
22. A method for routing data packets between functional units, each data packet having routing information associated therewith, the method comprising the steps of:
(a) reading the routing information;
(b) determining the direction to transport the data packet from the routing information; and
(c) transporting the data packet in the direction determined in step (b).
23. A processing system incorporating the interconnection system according to any one of claims 1 to 21.
24. A system according to claim 23, wherein each functional unit is connected to a node via an interface unit.
25. A system according to claim 24, wherein each interface unit comprises means to set the protocol for data to be transported to the interconnection system and to be received from the interconnection system.
26. A system according to any one of claims 24 to 25, wherein the functional units access the interconnection system using distributed arbitration.
27. A system according to any one of claims 24 to 26, wherein each functional unit comprises a reusable system on chip functional unit.
28. An integrated circuit incorporating the interconnection system according to any one of claims 1 to 21.
29. An integrated system comprising a plurality of chips, each chip incorporating the interconnection system according to any one of claims 1 to 21, wherein the interconnection system interconnects the plurality of chips.
30. A method for transporting a plurality of data packets via an interconnection system, the interconnection system comprising a plurality of nodes, the method comprising the steps of:
transporting a data packet between a first node and a second node; and
during transportation, only activating the portion of the interconnection system between the first node and the second node.
US10/468,167 2001-02-14 2002-02-14 Interconnection system Abandoned US20040114609A1 (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
GB0103687.0 2001-02-14
GB0103678.9 2001-02-14
GB0103678A GB0103678D0 (en) 2001-02-14 2001-02-14 Network processing
GB0103687A GB0103687D0 (en) 2001-02-14 2001-02-14 Network processing-architecture II
GB0121790A GB0121790D0 (en) 2001-02-14 2001-09-10 Network processing systems
GB0121790.0 2001-09-10
PCT/GB2002/000662 WO2002065700A2 (en) 2001-02-14 2002-02-14 An interconnection system

Publications (1)

Publication Number Publication Date
US20040114609A1 true US20040114609A1 (en) 2004-06-17

Family

ID=27256074

Family Applications (10)

Application Number Title Priority Date Filing Date
US10/073,948 Expired - Fee Related US7856543B2 (en) 2001-02-14 2002-02-14 Data processing architectures for packet handling wherein batches of data packets of unpredictable size are distributed across processing elements arranged in a SIMD array operable to process different respective packet protocols at once while executing a single common instruction stream
US10/468,167 Abandoned US20040114609A1 (en) 2001-02-14 2002-02-14 Interconnection system
US10/074,022 Abandoned US20020159466A1 (en) 2001-02-14 2002-02-14 Lookup engine
US10/468,168 Expired - Fee Related US7290162B2 (en) 2001-02-14 2002-02-14 Clock distribution system
US10/074,019 Abandoned US20020161926A1 (en) 2001-02-14 2002-02-14 Method for controlling the order of datagrams
US11/151,271 Expired - Fee Related US8200686B2 (en) 2001-02-14 2005-06-14 Lookup engine
US11/151,292 Abandoned US20050242976A1 (en) 2001-02-14 2005-06-14 Lookup engine
US11/752,299 Expired - Fee Related US7818541B2 (en) 2001-02-14 2007-05-23 Data processing architectures
US11/752,300 Expired - Fee Related US7917727B2 (en) 2001-02-14 2007-05-23 Data processing architectures for packet handling using a SIMD array
US12/965,673 Expired - Fee Related US8127112B2 (en) 2001-02-14 2010-12-10 SIMD array operable to process different respective packet protocols simultaneously while executing a single common instruction stream

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US10/073,948 Expired - Fee Related US7856543B2 (en) 2001-02-14 2002-02-14 Data processing architectures for packet handling wherein batches of data packets of unpredictable size are distributed across processing elements arranged in a SIMD array operable to process different respective packet protocols at once while executing a single common instruction stream

Family Applications After (8)

Application Number Title Priority Date Filing Date
US10/074,022 Abandoned US20020159466A1 (en) 2001-02-14 2002-02-14 Lookup engine
US10/468,168 Expired - Fee Related US7290162B2 (en) 2001-02-14 2002-02-14 Clock distribution system
US10/074,019 Abandoned US20020161926A1 (en) 2001-02-14 2002-02-14 Method for controlling the order of datagrams
US11/151,271 Expired - Fee Related US8200686B2 (en) 2001-02-14 2005-06-14 Lookup engine
US11/151,292 Abandoned US20050242976A1 (en) 2001-02-14 2005-06-14 Lookup engine
US11/752,299 Expired - Fee Related US7818541B2 (en) 2001-02-14 2007-05-23 Data processing architectures
US11/752,300 Expired - Fee Related US7917727B2 (en) 2001-02-14 2007-05-23 Data processing architectures for packet handling using a SIMD array
US12/965,673 Expired - Fee Related US8127112B2 (en) 2001-02-14 2010-12-10 SIMD array operable to process different respective packet protocols simultaneously while executing a single common instruction stream

Country Status (6)

Country Link
US (10) US7856543B2 (en)
JP (2) JP2004524617A (en)
CN (2) CN1613041A (en)
AU (1) AU2002233500A1 (en)
GB (5) GB2390506B (en)
WO (2) WO2002065700A2 (en)

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040042496A1 (en) * 2002-08-30 2004-03-04 Intel Corporation System including a segmentable, shared bus
US20050216625A1 (en) * 2004-03-09 2005-09-29 Smith Zachary S Suppressing production of bus transactions by a virtual-bus interface
US7055123B1 (en) * 2001-12-31 2006-05-30 Richard S. Norman High-performance interconnect arrangement for an array of discrete functional modules
US20070017694A1 (en) * 2005-07-20 2007-01-25 Tomoyuki Kubo Wiring board and manufacturing method for wiring board
US20070047584A1 (en) * 2005-08-24 2007-03-01 Spink Aaron T Interleaving data packets in a packet-based communication system
US20080276116A1 (en) * 2005-06-01 2008-11-06 Tobias Bjerregaard Method and an Apparatus for Providing Timing Signals to a Number of Circuits, an Integrated Circuit and a Node
US20090089030A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Distributed simulation and synchronization
US20090089029A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Enhanced execution speed to improve simulation performance
US20090089227A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Automated recommendations from simulation
US20090089031A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Integrated simulation of controllers and devices
US20090089234A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Automated code generation for simulators
US20090089027A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Simulation controls for model variablity and randomness
US20090268736A1 (en) * 2008-04-24 2009-10-29 Allison Brian D Early header CRC in data response packets with variable gap count
US20090271532A1 (en) * 2008-04-24 2009-10-29 Allison Brian D Early header CRC in data response packets with variable gap count
US20090268727A1 (en) * 2008-04-24 2009-10-29 Allison Brian D Early header CRC in data response packets with variable gap count
US20100241746A1 (en) * 2005-02-23 2010-09-23 International Business Machines Corporation Method, Program and System for Efficiently Hashing Packet Keys into a Firewall Connection Table
US20100278195A1 (en) * 2009-04-29 2010-11-04 Mahesh Wagh Packetized Interface For Coupling Agents
US7995618B1 (en) * 2007-10-01 2011-08-09 Teklatech A/S System and a method of transmitting data from a first device to a second device
US20130038427A1 (en) * 2010-03-12 2013-02-14 Zte Corporation Sight Spot Guiding System and Implementation Method Thereof
US20130229290A1 (en) * 2012-03-01 2013-09-05 Eaton Corporation Instrument panel bus interface
US20150012679A1 (en) * 2013-07-03 2015-01-08 Iii Holdings 2, Llc Implementing remote transaction functionalities between data processing nodes of a switched interconnect fabric
US10243882B1 (en) 2017-04-13 2019-03-26 Xilinx, Inc. Network on chip switch interconnect
US10505548B1 (en) 2018-05-25 2019-12-10 Xilinx, Inc. Multi-chip structure having configurable network-on-chip
US10503690B2 (en) 2018-02-23 2019-12-10 Xilinx, Inc. Programmable NOC compatible with multiple interface communication protocol
US10621129B2 (en) 2018-03-27 2020-04-14 Xilinx, Inc. Peripheral interconnect for configurable slave endpoint circuits
US10673745B2 (en) 2018-02-01 2020-06-02 Xilinx, Inc. End-to-end quality-of-service in a network-on-chip
US10680615B1 (en) 2019-03-27 2020-06-09 Xilinx, Inc. Circuit for and method of configuring and partially reconfiguring function blocks of an integrated circuit device
US20200250281A1 (en) * 2019-02-05 2020-08-06 Arm Limited Integrated circuit design and fabrication
US10824505B1 (en) 2018-08-21 2020-11-03 Xilinx, Inc. ECC proxy extension and byte organization for multi-master systems
US10838908B2 (en) 2018-07-20 2020-11-17 Xilinx, Inc. Configurable network-on-chip for a programmable device
US10891414B2 (en) 2019-05-23 2021-01-12 Xilinx, Inc. Hardware-software design flow for heterogeneous and programmable devices
US10891132B2 (en) 2019-05-23 2021-01-12 Xilinx, Inc. Flow convergence during hardware-software design for heterogeneous and programmable devices
US10936486B1 (en) 2019-02-21 2021-03-02 Xilinx, Inc. Address interleave support in a programmable device
US10963460B2 (en) 2018-12-06 2021-03-30 Xilinx, Inc. Integrated circuits and methods to accelerate data queries
US10977018B1 (en) 2019-12-05 2021-04-13 Xilinx, Inc. Development environment for heterogeneous devices
US11188312B2 (en) 2019-05-23 2021-11-30 Xilinx, Inc. Hardware-software design flow with high-level synthesis for heterogeneous and programmable devices
US20220027308A1 (en) * 2019-05-09 2022-01-27 SambaNova Systems, Inc. Control Barrier Network for Reconfigurable Data Processors
US11301295B1 (en) 2019-05-23 2022-04-12 Xilinx, Inc. Implementing an application specified as a data flow graph in an array of data processing engines
US11336287B1 (en) 2021-03-09 2022-05-17 Xilinx, Inc. Data processing engine array architecture with memory tiles
US11496418B1 (en) 2020-08-25 2022-11-08 Xilinx, Inc. Packet-based and time-multiplexed network-on-chip
US11520717B1 (en) 2021-03-09 2022-12-06 Xilinx, Inc. Memory tiles in data processing engine array
US11848670B2 (en) 2022-04-15 2023-12-19 Xilinx, Inc. Multiple partitions in a data processing array

Families Citing this family (232)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7549056B2 (en) 1999-03-19 2009-06-16 Broadcom Corporation System and method for processing and protecting content
US7174452B2 (en) * 2001-01-24 2007-02-06 Broadcom Corporation Method for processing multiple security policies applied to a data packet structure
US7856543B2 (en) 2001-02-14 2010-12-21 Rambus Inc. Data processing architectures for packet handling wherein batches of data packets of unpredictable size are distributed across processing elements arranged in a SIMD array operable to process different respective packet protocols at once while executing a single common instruction stream
US7383421B2 (en) * 2002-12-05 2008-06-03 Brightscale, Inc. Cellular engine for a data processing system
US7107478B2 (en) * 2002-12-05 2006-09-12 Connex Technology, Inc. Data processing system having a Cartesian Controller
US20030078997A1 (en) * 2001-10-22 2003-04-24 Franzel Kenneth S. Module and unified network backplane interface for local networks
FI113113B (en) 2001-11-20 2004-02-27 Nokia Corp Method and device for time synchronization of integrated circuits
US6836808B2 (en) * 2002-02-25 2004-12-28 International Business Machines Corporation Pipelined packet processing
US7627693B2 (en) * 2002-06-11 2009-12-01 Pandya Ashish A IP storage processor and engine therefor using RDMA
US7415723B2 (en) * 2002-06-11 2008-08-19 Pandya Ashish A Distributed network security system and a hardware processor therefor
US7408957B2 (en) * 2002-06-13 2008-08-05 International Business Machines Corporation Selective header field dispatch in a network processing system
US8015303B2 (en) * 2002-08-02 2011-09-06 Astute Networks Inc. High data rate stateful protocol processing
US7684400B2 (en) * 2002-08-08 2010-03-23 Intel Corporation Logarithmic time range-based multifield-correlation packet classification
US20040066779A1 (en) * 2002-10-04 2004-04-08 Craig Barrack Method and implementation for context switchover
US8037224B2 (en) 2002-10-08 2011-10-11 Netlogic Microsystems, Inc. Delegating network processor operations to star topology serial bus interfaces
US20050044324A1 (en) * 2002-10-08 2005-02-24 Abbas Rashid Advanced processor with mechanism for maximizing resource usage in an in-order pipeline with multiple threads
US7346757B2 (en) 2002-10-08 2008-03-18 Rmi Corporation Advanced processor translation lookaside buffer management in a multithreaded system
US9088474B2 (en) * 2002-10-08 2015-07-21 Broadcom Corporation Advanced processor with interfacing messaging network to a CPU
US8015567B2 (en) 2002-10-08 2011-09-06 Netlogic Microsystems, Inc. Advanced processor with mechanism for packet distribution at high line rate
US20050033831A1 (en) * 2002-10-08 2005-02-10 Abbas Rashid Advanced processor with a thread aware return address stack optimally used across active threads
US8478811B2 (en) 2002-10-08 2013-07-02 Netlogic Microsystems, Inc. Advanced processor with credit based scheme for optimal packet flow in a multi-processor system on a chip
US7984268B2 (en) 2002-10-08 2011-07-19 Netlogic Microsystems, Inc. Advanced processor scheduling in a multithreaded system
US7924828B2 (en) * 2002-10-08 2011-04-12 Netlogic Microsystems, Inc. Advanced processor with mechanism for fast packet queuing operations
US7627721B2 (en) 2002-10-08 2009-12-01 Rmi Corporation Advanced processor with cache coherency
US7961723B2 (en) * 2002-10-08 2011-06-14 Netlogic Microsystems, Inc. Advanced processor with mechanism for enforcing ordering between information sent on two independent networks
US7334086B2 (en) * 2002-10-08 2008-02-19 Rmi Corporation Advanced processor with system on a chip interconnect technology
US8176298B2 (en) 2002-10-08 2012-05-08 Netlogic Microsystems, Inc. Multi-core multi-threaded processing systems with instruction reordering in an in-order pipeline
US7596621B1 (en) * 2002-10-17 2009-09-29 Astute Networks, Inc. System and method for managing shared state using multiple programmed processors
US7814218B1 (en) 2002-10-17 2010-10-12 Astute Networks, Inc. Multi-protocol and multi-format stateful processing
US8151278B1 (en) 2002-10-17 2012-04-03 Astute Networks, Inc. System and method for timer management in a stateful protocol processing system
ATE438242T1 (en) * 2002-10-31 2009-08-15 Alcatel Lucent METHOD FOR PROCESSING DATA PACKETS AT LAYER THREE IN A TELECOMMUNICATIONS DEVICE
US7715392B2 (en) * 2002-12-12 2010-05-11 Stmicroelectronics, Inc. System and method for path compression optimization in a pipelined hardware bitmapped multi-bit trie algorithmic network search engine
JP4157403B2 (en) * 2003-03-19 2008-10-01 株式会社日立製作所 Packet communication device
US8477780B2 (en) * 2003-03-26 2013-07-02 Alcatel Lucent Processing packet information using an array of processing elements
US8539089B2 (en) * 2003-04-23 2013-09-17 Oracle America, Inc. System and method for vertical perimeter protection
EP1623330A2 (en) 2003-05-07 2006-02-08 Koninklijke Philips Electronics N.V. Processing system and method for transmitting data
US7558268B2 (en) * 2003-05-08 2009-07-07 Samsung Electronics Co., Ltd. Apparatus and method for combining forwarding tables in a distributed architecture router
US7500239B2 (en) * 2003-05-23 2009-03-03 Intel Corporation Packet processing system
US20050108518A1 (en) * 2003-06-10 2005-05-19 Pandya Ashish A. Runtime adaptable security processor
US7349958B2 (en) * 2003-06-25 2008-03-25 International Business Machines Corporation Method for improving performance in a computer storage system by regulating resource requests from clients
US7174398B2 (en) * 2003-06-26 2007-02-06 International Business Machines Corporation Method and apparatus for implementing data mapping with shuffle algorithm
US7702882B2 (en) * 2003-09-10 2010-04-20 Samsung Electronics Co., Ltd. Apparatus and method for performing high-speed lookups in a routing table
CA2442803A1 (en) * 2003-09-26 2005-03-26 Ibm Canada Limited - Ibm Canada Limitee Structure and method for managing workshares in a parallel region
US7886307B1 (en) * 2003-09-26 2011-02-08 The Mathworks, Inc. Object-oriented data transfer system for data sharing
US7120815B2 (en) * 2003-10-31 2006-10-10 Hewlett-Packard Development Company, L.P. Clock circuitry on plural integrated circuits
US7634500B1 (en) 2003-11-03 2009-12-15 Netlogic Microsystems, Inc. Multiple string searching using content addressable memory
AU2004297923B2 (en) * 2003-11-26 2008-07-10 Cisco Technology, Inc. Method and apparatus to inline encryption and decryption for a wireless station
US6954450B2 (en) * 2003-11-26 2005-10-11 Cisco Technology, Inc. Method and apparatus to provide data streaming over a network connection in a wireless MAC processor
US7340548B2 (en) 2003-12-17 2008-03-04 Microsoft Corporation On-chip bus
US7058424B2 (en) * 2004-01-20 2006-06-06 Lucent Technologies Inc. Method and apparatus for interconnecting wireless and wireline networks
GB0403237D0 (en) * 2004-02-13 2004-03-17 Imec Inter Uni Micro Electr A method for realizing ground bounce reduction in digital circuits adapted according to said method
US7903777B1 (en) * 2004-03-03 2011-03-08 Marvell International Ltd. System and method for reducing electromagnetic interference and ground bounce in an information communication system by controlling phase of clock signals among a plurality of information communication devices
US7478109B1 (en) * 2004-03-15 2009-01-13 Cisco Technology, Inc. Identification of a longest matching prefix based on a search of intervals corresponding to the prefixes
KR100990484B1 (en) * 2004-03-29 2010-10-29 삼성전자주식회사 Transmission clock signal generator for serial bus communication
US20050254486A1 (en) * 2004-05-13 2005-11-17 Ittiam Systems (P) Ltd. Multi processor implementation for signals requiring fast processing
DE102004035843B4 (en) * 2004-07-23 2010-04-15 Infineon Technologies Ag Router Network Processor
GB2417105B (en) 2004-08-13 2008-04-09 Clearspeed Technology Plc Processor memory system
US7913206B1 (en) * 2004-09-16 2011-03-22 Cadence Design Systems, Inc. Method and mechanism for performing partitioning of DRC operations
US7508397B1 (en) * 2004-11-10 2009-03-24 Nvidia Corporation Rendering of disjoint and overlapping blits
US8170019B2 (en) * 2004-11-30 2012-05-01 Broadcom Corporation CPU transmission of unmodified packets
US20060156316A1 (en) * 2004-12-18 2006-07-13 Gray Area Technologies System and method for application specific array processing
US20060212426A1 (en) * 2004-12-21 2006-09-21 Udaya Shakara Efficient CAM-based techniques to perform string searches in packet payloads
US7818705B1 (en) 2005-04-08 2010-10-19 Altera Corporation Method and apparatus for implementing a field programmable gate array architecture with programmable clock skew
WO2006127596A2 (en) * 2005-05-20 2006-11-30 Hillcrest Laboratories, Inc. Dynamic hyperlinking approach
US7373475B2 (en) * 2005-06-21 2008-05-13 Intel Corporation Methods for optimizing memory unit usage to maximize packet throughput for multi-processor multi-threaded architectures
US20070086456A1 (en) * 2005-08-12 2007-04-19 Electronics And Telecommunications Research Institute Integrated layer frame processing device including variable protocol header
US7904852B1 (en) 2005-09-12 2011-03-08 Cadence Design Systems, Inc. Method and system for implementing parallel processing of electronic design automation tools
US8218770B2 (en) * 2005-09-13 2012-07-10 Agere Systems Inc. Method and apparatus for secure key management and protection
US7353332B2 (en) * 2005-10-11 2008-04-01 Integrated Device Technology, Inc. Switching circuit implementing variable string matching
US7451293B2 (en) * 2005-10-21 2008-11-11 Brightscale Inc. Array of Boolean logic controlled processing elements with concurrent I/O processing and instruction sequencing
US7551609B2 (en) * 2005-10-21 2009-06-23 Cisco Technology, Inc. Data structure for storing and accessing multiple independent sets of forwarding information
US7835359B2 (en) * 2005-12-08 2010-11-16 International Business Machines Corporation Method and apparatus for striping message payload data over a network
JP2009523292A (en) * 2006-01-10 2009-06-18 ブライトスケール インコーポレイテッド Method and apparatus for scheduling multimedia data processing in parallel processing systems
US20070162531A1 (en) * 2006-01-12 2007-07-12 Bhaskar Kota Flow transform for integrated circuit design and simulation having combined data flow, control flow, and memory flow views
US8301885B2 (en) * 2006-01-27 2012-10-30 Fts Computertechnik Gmbh Time-controlled secure communication
KR20070088190A (en) * 2006-02-24 2007-08-29 삼성전자주식회사 Subword parallelism for processing multimedia data
EP2000973B1 (en) * 2006-03-30 2013-05-01 NEC Corporation Parallel image processing system control method and apparatus
US7617409B2 (en) * 2006-05-01 2009-11-10 Arm Limited System for checking clock-signal correspondence
US8390354B2 (en) * 2006-05-17 2013-03-05 Freescale Semiconductor, Inc. Delay configurable device and methods thereof
US8041929B2 (en) * 2006-06-16 2011-10-18 Cisco Technology, Inc. Techniques for hardware-assisted multi-threaded processing
JP2008004046A (en) * 2006-06-26 2008-01-10 Toshiba Corp Resource management device, and program for the same
US7584286B2 (en) * 2006-06-28 2009-09-01 Intel Corporation Flexible and extensible receive side scaling
US8448096B1 (en) 2006-06-30 2013-05-21 Cadence Design Systems, Inc. Method and system for parallel processing of IC design layouts
US7516437B1 (en) * 2006-07-20 2009-04-07 Xilinx, Inc. Skew-driven routing for networks
CN1909418B (en) * 2006-08-01 2010-05-12 华为技术有限公司 Clock distributing equipment for universal wireless interface and method for realizing speed switching
US20080040214A1 (en) * 2006-08-10 2008-02-14 Ip Commerce System and method for subsidizing payment transaction costs through online advertising
JP4846486B2 (en) * 2006-08-18 2011-12-28 富士通株式会社 Information processing apparatus and control method thereof
CA2557343C (en) * 2006-08-28 2015-09-22 Ibm Canada Limited-Ibm Canada Limitee Runtime code modification in a multi-threaded environment
WO2008027567A2 (en) * 2006-09-01 2008-03-06 Brightscale, Inc. Integral parallel machine
US20080059762A1 (en) * 2006-09-01 2008-03-06 Bogdan Mitu Multi-sequence control for a data parallel system
US9563433B1 (en) 2006-09-01 2017-02-07 Allsearch Semi Llc System and method for class-based execution of an instruction broadcasted to an array of processing elements
US20080055307A1 (en) * 2006-09-01 2008-03-06 Lazar Bivolarski Graphics rendering pipeline
US20080244238A1 (en) * 2006-09-01 2008-10-02 Bogdan Mitu Stream processing accelerator
US20080059763A1 (en) * 2006-09-01 2008-03-06 Lazar Bivolarski System and method for fine-grain instruction parallelism for increased efficiency of processing compressed multimedia data
US20080059467A1 (en) * 2006-09-05 2008-03-06 Lazar Bivolarski Near full motion search algorithm
US7657856B1 (en) 2006-09-12 2010-02-02 Cadence Design Systems, Inc. Method and system for parallel processing of IC design layouts
US7783654B1 (en) 2006-09-19 2010-08-24 Netlogic Microsystems, Inc. Multiple string searching using content addressable memory
JP4377899B2 (en) * 2006-09-20 2009-12-02 株式会社東芝 Resource management apparatus and program
US8010966B2 (en) * 2006-09-27 2011-08-30 Cisco Technology, Inc. Multi-threaded processing using path locks
US8179896B2 (en) 2006-11-09 2012-05-15 Justin Mark Sobaje Network processors and pipeline optimization methods
US7996348B2 (en) 2006-12-08 2011-08-09 Pandya Ashish A 100GBPS security and search architecture using programmable intelligent search memory (PRISM) that comprises one or more bit interval counters
US9141557B2 (en) 2006-12-08 2015-09-22 Ashish A. Pandya Dynamic random access memory (DRAM) that comprises a programmable intelligent search memory (PRISM) and a cryptography processing engine
JP4249780B2 (en) * 2006-12-26 2009-04-08 株式会社東芝 Device and program for managing resources
US7676444B1 (en) 2007-01-18 2010-03-09 Netlogic Microsystems, Inc. Iterative compare operations using next success size bitmap
ATE508415T1 (en) * 2007-03-06 2011-05-15 Nec Corp DATA TRANSFER NETWORK AND CONTROL DEVICE FOR A SYSTEM HAVING AN ARRAY OF PROCESSING ELEMENTS EACH EITHER SELF-CONTROLLED OR JOINTLY CONTROLLED
JP2009086733A (en) * 2007-09-27 2009-04-23 Toshiba Corp Information processor, control method of information processor and control program of information processor
US8515052B2 (en) * 2007-12-17 2013-08-20 Wai Wu Parallel signal processing system and method
US9596324B2 (en) 2008-02-08 2017-03-14 Broadcom Corporation System and method for parsing and allocating a plurality of packets to processor core threads
US8250578B2 (en) * 2008-02-22 2012-08-21 International Business Machines Corporation Pipelining hardware accelerators to computer systems
US8726289B2 (en) * 2008-02-22 2014-05-13 International Business Machines Corporation Streaming attachment of hardware accelerators to computer systems
CN102077493B (en) * 2008-04-30 2015-01-14 惠普开发有限公司 Intentionally skewed optical clock signal distribution
JP2009271724A (en) * 2008-05-07 2009-11-19 Toshiba Corp Hardware engine controller
US9619428B2 (en) 2008-05-30 2017-04-11 Advanced Micro Devices, Inc. SIMD processing unit with local data share and access to a global data share of a GPU
US8958419B2 (en) * 2008-06-16 2015-02-17 Intel Corporation Switch fabric primitives
US8566487B2 (en) * 2008-06-24 2013-10-22 Hartvig Ekner System and method for creating a scalable monolithic packet processing engine
US7949007B1 (en) 2008-08-05 2011-05-24 Xilinx, Inc. Methods of clustering actions for manipulating packets of a communication protocol
US7804844B1 (en) * 2008-08-05 2010-09-28 Xilinx, Inc. Dataflow pipeline implementing actions for manipulating packets of a communication protocol
US8311057B1 (en) 2008-08-05 2012-11-13 Xilinx, Inc. Managing formatting of packets of a communication protocol
US8160092B1 (en) 2008-08-05 2012-04-17 Xilinx, Inc. Transforming a declarative description of a packet processor
EP2327026A1 (en) * 2008-08-06 2011-06-01 Nxp B.V. Simd parallel processor architecture
CN101355482B (en) * 2008-09-04 2011-09-21 中兴通讯股份有限公司 Equipment, method and system for implementing identification of embedded device address sequence
US8493979B2 (en) * 2008-12-30 2013-07-23 Intel Corporation Single instruction processing of network packets
JP5238525B2 (en) * 2009-01-13 2013-07-17 株式会社東芝 Device and program for managing resources
KR101553652B1 (en) * 2009-02-18 2015-09-16 삼성전자 주식회사 Apparatus and method for compiling instruction for heterogeneous processor
US8140792B2 (en) * 2009-02-25 2012-03-20 International Business Machines Corporation Indirectly-accessed, hardware-affine channel storage in transaction-oriented DMA-intensive environments
US8874878B2 (en) * 2010-05-18 2014-10-28 Lsi Corporation Thread synchronization in a multi-thread, multi-flow network communications processor architecture
US9461930B2 (en) 2009-04-27 2016-10-04 Intel Corporation Modifying data streams without reordering in a multi-thread, multi-flow network processor
US8743877B2 (en) * 2009-12-21 2014-06-03 Steven L. Pope Header processing engine
US8332460B2 (en) * 2010-04-14 2012-12-11 International Business Machines Corporation Performing a local reduction operation on a parallel computer
KR20130141446A (en) * 2010-07-19 2013-12-26 어드밴스드 마이크로 디바이시즈, 인코포레이티드 Data processing using on-chip memory in multiple processing units
US8880507B2 (en) * 2010-07-22 2014-11-04 Brocade Communications Systems, Inc. Longest prefix match using binary search tree
US8904115B2 (en) * 2010-09-28 2014-12-02 Texas Instruments Incorporated Cache with multiple access pipelines
RU2436151C1 (en) * 2010-11-01 2011-12-10 Федеральное государственное унитарное предприятие "Российский Федеральный ядерный центр - Всероссийский научно-исследовательский институт экспериментальной физики" (ФГУП "РФЯЦ-ВНИИЭФ") Method of determining structure of hybrid computer system
US9667539B2 (en) * 2011-01-17 2017-05-30 Alcatel Lucent Method and apparatus for providing transport of customer QoS information via PBB networks
US8869162B2 (en) 2011-04-26 2014-10-21 Microsoft Corporation Stream processing on heterogeneous hardware devices
US9020892B2 (en) * 2011-07-08 2015-04-28 Microsoft Technology Licensing, Llc Efficient metadata storage
US8880494B2 (en) 2011-07-28 2014-11-04 Brocade Communications Systems, Inc. Longest prefix match scheme
US8923306B2 (en) 2011-08-02 2014-12-30 Cavium, Inc. Phased bucket pre-fetch in a network processor
US9344366B2 (en) 2011-08-02 2016-05-17 Cavium, Inc. System and method for rule matching in a processor
US8910178B2 (en) 2011-08-10 2014-12-09 International Business Machines Corporation Performing a global barrier operation in a parallel computer
US9154335B2 (en) * 2011-11-08 2015-10-06 Marvell Israel (M.I.S.L) Ltd. Method and apparatus for transmitting data on a network
US9542236B2 (en) 2011-12-29 2017-01-10 Oracle International Corporation Efficiency sequencer for multiple concurrently-executing threads of execution
WO2013100783A1 (en) * 2011-12-29 2013-07-04 Intel Corporation Method and system for control signalling in a data path module
US9495135B2 (en) 2012-02-09 2016-11-15 International Business Machines Corporation Developing collective operations for a parallel computer
US9178730B2 (en) 2012-02-24 2015-11-03 Freescale Semiconductor, Inc. Clock distribution module, synchronous digital system and method therefor
JP6353359B2 (en) * 2012-03-23 2018-07-04 株式会社Mush−A Data processing apparatus, data processing system, data structure, recording medium, storage device, and data processing method
JP2013222364A (en) * 2012-04-18 2013-10-28 Renesas Electronics Corp Signal processing circuit
US8775727B2 (en) 2012-08-31 2014-07-08 Lsi Corporation Lookup engine with pipelined access, speculative add and lock-in-hit function
US9082078B2 (en) 2012-07-27 2015-07-14 The Intellisis Corporation Neural processing engine and architecture using the same
CN103631315A (en) * 2012-08-22 2014-03-12 上海华虹集成电路有限责任公司 Clock design method facilitating timing sequence repair
US9185057B2 (en) * 2012-12-05 2015-11-10 The Intellisis Corporation Smart memory
US9639371B2 (en) * 2013-01-29 2017-05-02 Advanced Micro Devices, Inc. Solution to divergent branches in a SIMD core using hardware pointers
US9391893B2 (en) 2013-02-26 2016-07-12 Dell Products L.P. Lookup engine for an information handling system
US20140269690A1 (en) * 2013-03-13 2014-09-18 Qualcomm Incorporated Network element with distributed flow tables
US9185003B1 (en) * 2013-05-02 2015-11-10 Amazon Technologies, Inc. Distributed clock network with time synchronization and activity tracing between nodes
US10331583B2 (en) 2013-09-26 2019-06-25 Intel Corporation Executing distributed memory operations using processing elements connected by distributed channels
ES2649163T3 (en) * 2013-10-11 2018-01-10 Wpt Gmbh Elastic floor covering in the form of a continuously rolling material
US20150120224A1 (en) * 2013-10-29 2015-04-30 C3 Energy, Inc. Systems and methods for processing data relating to energy usage
EP3075135A1 (en) 2013-11-29 2016-10-05 Nec Corporation Apparatus, system and method for mtc
US9547553B1 (en) 2014-03-10 2017-01-17 Parallel Machines Ltd. Data resiliency in a shared memory pool
US9372724B2 (en) * 2014-04-01 2016-06-21 Freescale Semiconductor, Inc. System and method for conditional task switching during ordering scope transitions
US9372723B2 (en) * 2014-04-01 2016-06-21 Freescale Semiconductor, Inc. System and method for conditional task switching during ordering scope transitions
US9781027B1 (en) 2014-04-06 2017-10-03 Parallel Machines Ltd. Systems and methods to communicate with external destinations via a memory network
US9690713B1 (en) 2014-04-22 2017-06-27 Parallel Machines Ltd. Systems and methods for effectively interacting with a flash memory
US9529622B1 (en) 2014-12-09 2016-12-27 Parallel Machines Ltd. Systems and methods for automatic generation of task-splitting code
US9477412B1 (en) 2014-12-09 2016-10-25 Parallel Machines Ltd. Systems and methods for automatically aggregating write requests
US9733981B2 (en) 2014-06-10 2017-08-15 Nxp Usa, Inc. System and method for conditional task switching during ordering scope transitions
US9753873B1 (en) 2014-12-09 2017-09-05 Parallel Machines Ltd. Systems and methods for key-value transactions
US9781225B1 (en) 2014-12-09 2017-10-03 Parallel Machines Ltd. Systems and methods for cache streams
US9639473B1 (en) 2014-12-09 2017-05-02 Parallel Machines Ltd. Utilizing a cache mechanism by copying a data set from a cache-disabled memory location to a cache-enabled memory location
US9632936B1 (en) 2014-12-09 2017-04-25 Parallel Machines Ltd. Two-tier distributed memory
WO2016118979A2 (en) 2015-01-23 2016-07-28 C3, Inc. Systems, methods, and devices for an enterprise internet-of-things application development platform
US9552327B2 (en) 2015-01-29 2017-01-24 Knuedge Incorporated Memory controller for a network on a chip device
US10061531B2 (en) 2015-01-29 2018-08-28 Knuedge Incorporated Uniform system wide addressing for a computing system
US9749225B2 (en) * 2015-04-17 2017-08-29 Huawei Technologies Co., Ltd. Software defined network (SDN) control signaling for traffic engineering to enable multi-type transport in a data plane
US20160381136A1 (en) * 2015-06-24 2016-12-29 Futurewei Technologies, Inc. System, method, and computer program for providing rest services to fine-grained resources based on a resource-oriented network
CN106326967B (en) * 2015-06-29 2023-05-05 四川谦泰仁投资管理有限公司 RFID chip with interactive switch input port
US10313231B1 (en) * 2016-02-08 2019-06-04 Barefoot Networks, Inc. Resilient hashing for forwarding packets
US10063407B1 (en) 2016-02-08 2018-08-28 Barefoot Networks, Inc. Identifying and marking failed egress links in data plane
US10027583B2 (en) 2016-03-22 2018-07-17 Knuedge Incorporated Chained packet sequences in a network on a chip architecture
US9595308B1 (en) 2016-03-31 2017-03-14 Altera Corporation Multiple-die synchronous insertion delay measurement circuit and methods
US10346049B2 (en) 2016-04-29 2019-07-09 Friday Harbor Llc Distributed contiguous reads in a network on a chip architecture
US10402168B2 (en) 2016-10-01 2019-09-03 Intel Corporation Low energy consumption mantissa multiplication for floating point multiply-add operations
US10664942B2 (en) * 2016-10-21 2020-05-26 Advanced Micro Devices, Inc. Reconfigurable virtual graphics and compute processor pipeline
US10084687B1 (en) 2016-11-17 2018-09-25 Barefoot Networks, Inc. Weighted-cost multi-pathing using range lookups
US10416999B2 (en) 2016-12-30 2019-09-17 Intel Corporation Processors, methods, and systems with a configurable spatial accelerator
US10474375B2 (en) 2016-12-30 2019-11-12 Intel Corporation Runtime address disambiguation in acceleration hardware
US10558575B2 (en) 2016-12-30 2020-02-11 Intel Corporation Processors, methods, and systems with a configurable spatial accelerator
US10572376B2 (en) 2016-12-30 2020-02-25 Intel Corporation Memory ordering in acceleration hardware
US10237206B1 (en) 2017-03-05 2019-03-19 Barefoot Networks, Inc. Equal cost multiple path group failover for multicast
US10404619B1 (en) 2017-03-05 2019-09-03 Barefoot Networks, Inc. Link aggregation group failover for multicast
US10296351B1 (en) * 2017-03-15 2019-05-21 Ambarella, Inc. Computer vision processing in hardware data paths
CN107704922B (en) * 2017-04-19 2020-12-08 赛灵思公司 Artificial neural network processing device
CN107679621B (en) 2017-04-19 2020-12-08 赛灵思公司 Artificial neural network processing device
CN107679620B (en) * 2017-04-19 2020-05-26 赛灵思公司 Artificial neural network processing device
US10514719B2 (en) * 2017-06-27 2019-12-24 Biosense Webster (Israel) Ltd. System and method for synchronization among clocks in a wireless system
US10445451B2 (en) 2017-07-01 2019-10-15 Intel Corporation Processors, methods, and systems for a configurable spatial accelerator with performance, correctness, and power reduction features
US10387319B2 (en) 2017-07-01 2019-08-20 Intel Corporation Processors, methods, and systems for a configurable spatial accelerator with memory system performance, power reduction, and atomics support features
US10467183B2 (en) 2017-07-01 2019-11-05 Intel Corporation Processors and methods for pipelined runtime services in a spatial array
US10445234B2 (en) 2017-07-01 2019-10-15 Intel Corporation Processors, methods, and systems for a configurable spatial accelerator with transactional and replay features
US10515049B1 (en) 2017-07-01 2019-12-24 Intel Corporation Memory circuits and methods for distributed memory hazard detection and error recovery
US10515046B2 (en) 2017-07-01 2019-12-24 Intel Corporation Processors, methods, and systems with a configurable spatial accelerator
US10469397B2 (en) 2017-07-01 2019-11-05 Intel Corporation Processors and methods with configurable network-based dataflow operator circuits
US10496574B2 (en) 2017-09-28 2019-12-03 Intel Corporation Processors, methods, and systems for a memory fence in a configurable spatial accelerator
US11086816B2 (en) 2017-09-28 2021-08-10 Intel Corporation Processors, methods, and systems for debugging a configurable spatial accelerator
US10445098B2 (en) 2017-09-30 2019-10-15 Intel Corporation Processors and methods for privileged configuration in a spatial array
US10380063B2 (en) 2017-09-30 2019-08-13 Intel Corporation Processors, methods, and systems with a configurable spatial accelerator having a sequencer dataflow operator
CN107831824B (en) * 2017-10-16 2021-04-06 北京比特大陆科技有限公司 Clock signal transmission method and device, multiplexing chip and electronic equipment
GB2568087B (en) * 2017-11-03 2022-07-20 Imagination Tech Ltd Activation functions for deep neural networks
US10565134B2 (en) 2017-12-30 2020-02-18 Intel Corporation Apparatus, methods, and systems for multicast in a configurable spatial accelerator
US10417175B2 (en) 2017-12-30 2019-09-17 Intel Corporation Apparatus, methods, and systems for memory consistency in a configurable spatial accelerator
US10445250B2 (en) 2017-12-30 2019-10-15 Intel Corporation Apparatus, methods, and systems with a configurable spatial accelerator
JP2019153909A (en) * 2018-03-02 2019-09-12 株式会社リコー Semiconductor integrated circuit and clock supply method
US11307873B2 (en) 2018-04-03 2022-04-19 Intel Corporation Apparatus, methods, and systems for unstructured data flow in a configurable spatial accelerator with predicate propagation and merging
US10564980B2 (en) 2018-04-03 2020-02-18 Intel Corporation Apparatus, methods, and systems for conditional queues in a configurable spatial accelerator
US10853073B2 (en) 2018-06-30 2020-12-01 Intel Corporation Apparatuses, methods, and systems for conditional operations in a configurable spatial accelerator
US10891240B2 (en) 2018-06-30 2021-01-12 Intel Corporation Apparatus, methods, and systems for low latency communication in a configurable spatial accelerator
US11200186B2 (en) 2018-06-30 2021-12-14 Intel Corporation Apparatuses, methods, and systems for operations in a configurable spatial accelerator
US10459866B1 (en) 2018-06-30 2019-10-29 Intel Corporation Apparatuses, methods, and systems for integrated control and data processing in a configurable spatial accelerator
US11176281B2 (en) * 2018-10-08 2021-11-16 Micron Technology, Inc. Security managers and methods for implementing security protocols in a reconfigurable fabric
US10678724B1 (en) 2018-12-29 2020-06-09 Intel Corporation Apparatuses, methods, and systems for in-network storage in a configurable spatial accelerator
US11029927B2 (en) 2019-03-30 2021-06-08 Intel Corporation Methods and apparatus to detect and annotate backedges in a dataflow graph
US10965536B2 (en) 2019-03-30 2021-03-30 Intel Corporation Methods and apparatus to insert buffers in a dataflow graph
US10817291B2 (en) 2019-03-30 2020-10-27 Intel Corporation Apparatuses, methods, and systems for swizzle operations in a configurable spatial accelerator
US10915471B2 (en) 2019-03-30 2021-02-09 Intel Corporation Apparatuses, methods, and systems for memory interface circuit allocation in a configurable spatial accelerator
US11288244B2 (en) * 2019-06-10 2022-03-29 Akamai Technologies, Inc. Tree deduplication
US11037050B2 (en) 2019-06-29 2021-06-15 Intel Corporation Apparatuses, methods, and systems for memory interface circuit arbitration in a configurable spatial accelerator
US11907713B2 (en) 2019-12-28 2024-02-20 Intel Corporation Apparatuses, methods, and systems for fused operations using sign modification in a processing element of a configurable spatial accelerator
US11320885B2 (en) 2020-05-26 2022-05-03 Dell Products L.P. Wide range power mechanism for over-speed memory design
CN114528246A (en) * 2020-11-23 2022-05-24 深圳比特微电子科技有限公司 Operation core, calculation chip and encrypted currency mining machine
US11768714B2 (en) 2021-06-22 2023-09-26 Microsoft Technology Licensing, Llc On-chip hardware semaphore array supporting multiple conditionals
US11797480B2 (en) * 2021-12-31 2023-10-24 Tsx Inc. Storage of order books with persistent data structures

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5659781A (en) * 1994-06-29 1997-08-19 Larson; Noble G. Bidirectional systolic ring network
US5828858A (en) * 1996-09-16 1998-10-27 Virginia Tech Intellectual Properties, Inc. Worm-hole run-time reconfigurable processor field programmable gate array (FPGA)
US5923660A (en) * 1996-01-31 1999-07-13 Galileo Technologies Ltd. Switching ethernet controller
US6009488A (en) * 1997-11-07 1999-12-28 Microlinc, Llc Computer having packet-based interconnect channel
US6208619B1 (en) * 1997-03-27 2001-03-27 Kabushiki Kaisha Toshiba Packet data flow control method and device
US6366584B1 (en) * 1999-02-06 2002-04-02 Triton Network Systems, Inc. Commercial network based on point to point radios

Family Cites Families (151)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IT1061921B (en) * 1976-06-23 1983-04-30 Lolli & C Spa IMPROVEMENT IN DIFFUSERS FOR AIR CONDITIONING SYSTEMS
USD259208S (en) * 1979-04-23 1981-05-12 Mccullough John R Roof vent
GB8401805D0 (en) * 1984-01-24 1984-02-29 Int Computers Ltd Data processing apparatus
JPS61156338A (en) * 1984-12-27 1986-07-16 Toshiba Corp Multiprocessor system
US4641571A (en) * 1985-07-15 1987-02-10 Enamel Products & Plating Co. Turbo fan vent
US4850027A (en) * 1985-07-26 1989-07-18 International Business Machines Corporation Configurable parallel pipeline image processing system
JP2564805B2 (en) * 1985-08-08 1996-12-18 日本電気株式会社 Information processing device
JPH0771111B2 (en) * 1985-09-13 1995-07-31 日本電気株式会社 Packet exchange processor
US5021947A (en) * 1986-03-31 1991-06-04 Hughes Aircraft Company Data-flow multiprocessor architecture with three dimensional multistage interconnection network for efficient signal and data processing
GB8618943D0 (en) * 1986-08-02 1986-09-10 Int Computers Ltd Data processing apparatus
DE3751412T2 (en) * 1986-09-02 1995-12-14 Fuji Photo Film Co Ltd Method and device for image processing with gradation correction of the image signal.
US5418970A (en) * 1986-12-17 1995-05-23 Massachusetts Institute Of Technology Parallel processing system with processor array with processing elements addressing associated memories using host supplied address value and base register content
GB8723203D0 (en) * 1987-10-02 1987-11-04 Crosfield Electronics Ltd Interactive image modification
DE3742941A1 (en) * 1987-12-18 1989-07-06 Standard Elektrik Lorenz Ag PACKAGE BROKERS
JP2559262B2 (en) * 1988-10-13 1996-12-04 富士写真フイルム株式会社 Magnetic disk
JPH02105910A (en) * 1988-10-14 1990-04-18 Hitachi Ltd Logic integrated circuit
AU620994B2 (en) * 1989-07-12 1992-02-27 Digital Equipment Corporation Compressed prefix matching database searching
US5212777A (en) * 1989-11-17 1993-05-18 Texas Instruments Incorporated Multi-processor reconfigurable in single instruction multiple data (SIMD) and multiple instruction multiple data (MIMD) modes and method of operation
US5218709A (en) * 1989-12-28 1993-06-08 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration Special purpose parallel computer architecture for real-time control and simulation in robotic applications
US5426610A (en) * 1990-03-01 1995-06-20 Texas Instruments Incorporated Storage circuitry using sense amplifier with temporary pause for voltage supply isolation
JPH04219859A (en) 1990-03-12 1992-08-10 Hewlett Packard Co <Hp> Hardware distributor which distributes series-instruction-stream data to parallel processors
US5327159A (en) * 1990-06-27 1994-07-05 Texas Instruments Incorporated Packed bus selection of multiple pixel depths in palette devices, systems and methods
US5121198A (en) * 1990-06-28 1992-06-09 Eastman Kodak Company Method of setting the contrast of a color video picture in a computer controlled photographic film analyzing system
US5963746A (en) 1990-11-13 1999-10-05 International Business Machines Corporation Fully distributed processing memory element
US5752067A (en) 1990-11-13 1998-05-12 International Business Machines Corporation Fully scalable parallel processing system having asynchronous SIMD processing
US5590345A (en) 1990-11-13 1996-12-31 International Business Machines Corporation Advanced parallel array processor(APAP)
US5765011A (en) 1990-11-13 1998-06-09 International Business Machines Corporation Parallel processing system having a synchronous SIMD processing with processing elements emulating SIMD operation using individual instruction streams
US5625836A (en) 1990-11-13 1997-04-29 International Business Machines Corporation SIMD/MIMD processing memory element (PME)
US5367643A (en) * 1991-02-06 1994-11-22 International Business Machines Corporation Generic high bandwidth adapter having data packet memory configured in three level hierarchy for temporary storage of variable length data packets
US5285528A (en) * 1991-02-22 1994-02-08 International Business Machines Corporation Data structures and algorithms for managing lock states of addressable element ranges
WO1992015960A1 (en) * 1991-03-05 1992-09-17 Hajime Seki Electronic computer system and processor elements used for this system
US5313582A (en) * 1991-04-30 1994-05-17 Standard Microsystems Corporation Method and apparatus for buffering data within stations of a communication network
US5224100A (en) * 1991-05-09 1993-06-29 David Sarnoff Research Center, Inc. Routing technique for a hierarchical interprocessor-communication network between massively-parallel processors
EP0593609A1 (en) * 1991-07-01 1994-04-27 Telstra Corporation Limited High speed switching architecture
US5404550A (en) * 1991-07-25 1995-04-04 Tandem Computers Incorporated Method and apparatus for executing tasks by following a linked list of memory packets
US5155484A (en) * 1991-09-13 1992-10-13 Salient Software, Inc. Fast data compressor with direct lookup table indexing into history buffer
JP2750968B2 (en) * 1991-11-18 1998-05-18 シャープ株式会社 Data driven information processor
US5307381A (en) * 1991-12-27 1994-04-26 Intel Corporation Skew-free clock signal distribution network in a microprocessor
US5603028A (en) * 1992-03-02 1997-02-11 Mitsubishi Denki Kabushiki Kaisha Method and apparatus for data distribution
JPH0696035A (en) 1992-09-16 1994-04-08 Sanyo Electric Co Ltd Processing element and parallel processing computer using the same
EP0601715A1 (en) * 1992-12-11 1994-06-15 National Semiconductor Corporation Bus of CPU core optimized for accessing on-chip memory devices
US5579223A (en) * 1992-12-24 1996-11-26 Microsoft Corporation Method and system for incorporating modifications made to a computer program into a translated version of the computer program
GB2277235B (en) * 1993-04-14 1998-01-07 Plessey Telecomm Apparatus and method for the digital transmission of data
US5640551A (en) * 1993-04-14 1997-06-17 Apple Computer, Inc. Efficient high speed trie search process
US5420858A (en) * 1993-05-05 1995-05-30 Synoptics Communications, Inc. Method and apparatus for communications from a non-ATM communication medium to an ATM communication medium
JP2629568B2 (en) * 1993-07-30 1997-07-09 日本電気株式会社 ATM cell switching system
US5918061A (en) * 1993-12-29 1999-06-29 Intel Corporation Enhanced power managing unit (PMU) in a multiprocessor chip
US5524223A (en) 1994-01-31 1996-06-04 Motorola, Inc. Instruction accelerator for processing loop instructions with address generator using multiple stored increment values
US5423003A (en) * 1994-03-03 1995-06-06 Geonet Limited L.P. System for managing network computer applications
DE69428186T2 (en) * 1994-04-28 2002-03-28 Hewlett Packard Co Multicast device
EP0681236B1 (en) * 1994-05-05 2000-11-22 Conexant Systems, Inc. Space vector data path
BR9506208A (en) 1994-05-06 1996-04-23 Motorola Inc Communication system and process for routing calls to a terminal
US5463732A (en) * 1994-05-13 1995-10-31 David Sarnoff Research Center, Inc. Method and apparatus for accessing a distributed data buffer
US5682480A (en) * 1994-08-15 1997-10-28 Hitachi, Ltd. Parallel computer system for performing barrier synchronization by transferring the synchronization packet through a path which bypasses the packet buffer in response to an interrupt
US5949781A (en) * 1994-08-31 1999-09-07 Brooktree Corporation Controller for ATM segmentation and reassembly
US5586119A (en) * 1994-08-31 1996-12-17 Motorola, Inc. Method and apparatus for packet alignment in a communication system
US5754584A (en) * 1994-09-09 1998-05-19 Omnipoint Corporation Non-coherent spread-spectrum continuous-phase modulation communication system
JPH10508714A (en) * 1994-11-07 1998-08-25 Temple University - Of The Commonwealth System Of Higher Education Multicomputer system and method
US5651099A (en) * 1995-01-26 1997-07-22 Hewlett-Packard Company Use of a genetic algorithm to optimize memory space
JPH08249306A (en) * 1995-03-09 1996-09-27 Sharp Corp Data driven type information processor
US5634068A (en) * 1995-03-31 1997-05-27 Sun Microsystems, Inc. Packet switched cache coherent multiprocessor system
US5835095A (en) * 1995-05-08 1998-11-10 Intergraph Corporation Visible line processor
JP3515263B2 (en) * 1995-05-18 2004-04-05 株式会社東芝 Router device, data communication network system, node device, data transfer method, and network connection method
US5689677A (en) 1995-06-05 1997-11-18 Macmillan; David C. Circuit for enhancing performance of a computer for personal use
US6147996A (en) * 1995-08-04 2000-11-14 Cisco Technology, Inc. Pipelined multiple issue packet switch
US6115802A (en) * 1995-10-13 2000-09-05 Sun Microsystems, Inc. Efficient hash table for use in multi-threaded environments
US5612956A (en) * 1995-12-15 1997-03-18 General Instrument Corporation Of Delaware Reformatting of variable rate data for fixed rate communication
US5822606A (en) * 1996-01-11 1998-10-13 Morton; Steven G. DSP having a plurality of like processors controlled in parallel by an instruction word, and a control processor also controlled by the instruction word
EP0879544B1 (en) * 1996-02-06 2003-05-02 International Business Machines Corporation Parallel on-the-fly processing of fixed length cells
US5781549A (en) * 1996-02-23 1998-07-14 Allied Telesyn International Corp. Method and apparatus for switching data packets in a data network
US6035193A (en) 1996-06-28 2000-03-07 At&T Wireless Services Inc. Telephone system having land-line-supported private base station switchable into cellular network
US6101176A (en) 1996-07-24 2000-08-08 Nokia Mobile Phones Method and apparatus for operating an indoor CDMA telecommunications system
US6088355A (en) * 1996-10-11 2000-07-11 C-Cube Microsystems, Inc. Processing system with pointer-based ATM segmentation and reassembly
US6791947B2 (en) * 1996-12-16 2004-09-14 Juniper Networks In-line packet processing
JP3000961B2 (en) * 1997-06-06 2000-01-17 日本電気株式会社 Semiconductor integrated circuit
US5969559A (en) * 1997-06-09 1999-10-19 Schwartz; David M. Method and apparatus for using a power grid for clock distribution in semiconductor integrated circuits
US5828870A (en) * 1997-06-30 1998-10-27 Adaptec, Inc. Method and apparatus for controlling clock skew in an integrated circuit
JP3469046B2 (en) * 1997-07-08 2003-11-25 株式会社東芝 Functional block and semiconductor integrated circuit device
US6047304A (en) * 1997-07-29 2000-04-04 Nortel Networks Corporation Method and apparatus for performing lane arithmetic to perform network processing
WO1999014893A2 (en) * 1997-09-17 1999-03-25 Sony Electronics Inc. Multi-port bridge with triplet architecture and periodical update of address look-up table
JPH11194850A (en) * 1997-09-19 1999-07-21 Lsi Logic Corp Clock distribution network for integrated circuit, and clock distribution method
US5872993A (en) * 1997-12-01 1999-02-16 Advanced Micro Devices, Inc. Communications system with multiple, simultaneous accesses to a memory
US6081523A (en) * 1997-12-05 2000-06-27 Advanced Micro Devices, Inc. Arrangement for transmitting packet data segments from a media access controller across multiple physical links
US6219796B1 (en) * 1997-12-23 2001-04-17 Texas Instruments Incorporated Power reduction for processors by software control of functional units
US6301603B1 (en) * 1998-02-17 2001-10-09 Euphonics Incorporated Scalable audio processing on a heterogeneous processor array
JP3490286B2 (en) 1998-03-13 2004-01-26 株式会社東芝 Router device and frame transfer method
JPH11272629A (en) * 1998-03-19 1999-10-08 Hitachi Ltd Data processor
US6052769A (en) * 1998-03-31 2000-04-18 Intel Corporation Method and apparatus for moving select non-contiguous bytes of packed data in a single instruction
US6275508B1 (en) * 1998-04-21 2001-08-14 Nexabit Networks, Llc Method of and system for processing datagram headers for high speed computer network interfaces at low clock speeds, utilizing scalable algorithms for performing such network header adaptation (SAPNA)
WO1999057858A1 (en) * 1998-05-07 1999-11-11 Cabletron Systems, Inc. Multiple priority buffering in a computer network
US6131102A (en) * 1998-06-15 2000-10-10 Microsoft Corporation Method and system for cost computation of spelling suggestions and automatic replacement
US6305001B1 (en) * 1998-06-18 2001-10-16 Lsi Logic Corporation Clock distribution network planning and method therefor
EP0991231B1 (en) 1998-09-10 2009-07-01 International Business Machines Corporation Packet switch adapter for variable length packets
US6393026B1 (en) * 1998-09-17 2002-05-21 Nortel Networks Limited Data packet processing system and method for a router
EP0992895A1 (en) * 1998-10-06 2000-04-12 Texas Instruments Inc. Hardware accelerator for data processing systems
JP3504510B2 (en) * 1998-10-12 2004-03-08 日本電信電話株式会社 Packet switch
JP3866425B2 (en) * 1998-11-12 2007-01-10 株式会社日立コミュニケーションテクノロジー Packet switch
US6272522B1 (en) * 1998-11-17 2001-08-07 Sun Microsystems, Incorporated Computer data packet switching and load balancing system using a general-purpose multiprocessor architecture
US6256421B1 (en) * 1998-12-07 2001-07-03 Xerox Corporation Method and apparatus for simulating JPEG compression
JP3704438B2 (en) * 1998-12-09 2005-10-12 株式会社日立製作所 Variable-length packet communication device
US6338078B1 (en) * 1998-12-17 2002-01-08 International Business Machines Corporation System and method for sequencing packets for multiprocessor parallelization in a computer network system
JP3587076B2 (en) 1999-03-05 2004-11-10 松下電器産業株式会社 Packet receiver
US6605001B1 (en) * 1999-04-23 2003-08-12 Elia Rocco Tarantino Dice game in which categories are filled and scores awarded
GB2352536A (en) * 1999-07-21 2001-01-31 Element 14 Ltd Conditional instruction execution
GB2352595B (en) * 1999-07-27 2003-10-01 Sgs Thomson Microelectronics Data processing device
USD428484S (en) * 1999-08-03 2000-07-18 Zirk Todd A Copper roof vent cover
US6631422B1 (en) 1999-08-26 2003-10-07 International Business Machines Corporation Network adapter utilizing a hashing function for distributing packets to multiple processors for parallel processing
US6404752B1 (en) * 1999-08-27 2002-06-11 International Business Machines Corporation Network switch using network processor and methods
US6631419B1 (en) * 1999-09-22 2003-10-07 Juniper Networks, Inc. Method and apparatus for high-speed longest prefix and masked prefix table search
US6963572B1 (en) * 1999-10-22 2005-11-08 Alcatel Canada Inc. Method and apparatus for segmentation and reassembly of data packets in a communication switch
AU5075301A (en) * 1999-10-26 2001-07-03 Arthur D. Little, Inc. MIMD arrangement of SIMD machines
JP2001177574A (en) * 1999-12-20 2001-06-29 Kddi Corp Transmission controller in packet exchange network
GB2357601B (en) * 1999-12-23 2004-03-31 Ibm Remote power control
US6661794B1 (en) 1999-12-29 2003-12-09 Intel Corporation Method and apparatus for gigabit packet assignment for multithreaded packet processing
ATE280411T1 (en) * 2000-01-07 2004-11-15 Ibm METHOD AND SYSTEM FOR FRAMEWORK AND PROTOCOL CLASSIFICATION
US20030093613A1 (en) * 2000-01-14 2003-05-15 David Sherman Compressed ternary mask system and method
JP2001202345A (en) * 2000-01-21 2001-07-27 Hitachi Ltd Parallel processor
ATE319249T1 (en) * 2000-01-27 2006-03-15 Ibm METHOD AND DEVICE FOR CLASSIFICATION OF DATA PACKETS
US6704794B1 (en) * 2000-03-03 2004-03-09 Nokia Intelligent Edge Routers Inc. Cell reassembly for packet based networks
US20020107903A1 (en) * 2000-11-07 2002-08-08 Richter Roger K. Methods and systems for the order serialization of information in a network processing environment
JP2001251349A (en) * 2000-03-06 2001-09-14 Fujitsu Ltd Packet processor
US7139282B1 (en) * 2000-03-24 2006-11-21 Juniper Networks, Inc. Bandwidth division for packet processing
US7089240B2 (en) * 2000-04-06 2006-08-08 International Business Machines Corporation Longest prefix match lookup using hash function
US7107265B1 (en) * 2000-04-06 2006-09-12 International Business Machines Corporation Software management tree implementation for a network processor
US6718326B2 (en) * 2000-08-17 2004-04-06 Nippon Telegraph And Telephone Corporation Packet classification search device and method
DE10059026A1 (en) 2000-11-28 2002-06-13 Infineon Technologies Ag Unit for the distribution and processing of data packets
GB2370381B (en) * 2000-12-19 2003-12-24 Picochip Designs Ltd Processor architecture
USD453960S1 (en) * 2001-01-30 2002-02-26 Molded Products Company Shroud for a fan assembly
US6832261B1 (en) 2001-02-04 2004-12-14 Cisco Technology, Inc. Method and apparatus for distributed resequencing and reassembly of subdivided packets
GB2407673B (en) 2001-02-14 2005-08-24 Clearspeed Technology Plc Lookup engine
US7856543B2 (en) 2001-02-14 2010-12-21 Rambus Inc. Data processing architectures for packet handling wherein batches of data packets of unpredictable size are distributed across processing elements arranged in a SIMD array operable to process different respective packet protocols at once while executing a single common instruction stream
JP4475835B2 (en) * 2001-03-05 2010-06-09 富士通株式会社 Input line interface device and packet communication device
USD471971S1 (en) * 2001-03-20 2003-03-18 Flettner Ventilator Limited Ventilation cover
CA97495S (en) * 2001-03-20 2003-05-07 Flettner Ventilator Ltd Rotor
US6687715B2 (en) * 2001-06-28 2004-02-03 Intel Corporation Parallel lookups that keep order
US6922716B2 (en) 2001-07-13 2005-07-26 Motorola, Inc. Method and apparatus for vector processing
US7257590B2 (en) * 2001-08-29 2007-08-14 Nokia Corporation Method and system for classifying binary strings
US7283538B2 (en) * 2001-10-12 2007-10-16 Vormetric, Inc. Load balanced scalable network gateway processor architecture
US7317730B1 (en) * 2001-10-13 2008-01-08 Greenfield Networks, Inc. Queueing architecture and load balancing for parallel packet processing in communication networks
US6941446B2 (en) 2002-01-21 2005-09-06 Analog Devices, Inc. Single instruction multiple data array cell
US7382782B1 (en) 2002-04-12 2008-06-03 Juniper Networks, Inc. Packet spraying for load balancing across multiple packet processors
US20030235194A1 (en) * 2002-06-04 2003-12-25 Mike Morrison Network processor with multiple multi-threaded packet-type specific engines
US7200137B2 (en) * 2002-07-29 2007-04-03 Freescale Semiconductor, Inc. On chip network that maximizes interconnect utilization between processing elements
US8015567B2 (en) 2002-10-08 2011-09-06 Netlogic Microsystems, Inc. Advanced processor with mechanism for packet distribution at high line rate
GB0226249D0 (en) * 2002-11-11 2002-12-18 Clearspeed Technology Ltd Traffic handling system
US7656799B2 (en) 2003-07-29 2010-02-02 Citrix Systems, Inc. Flow control system architecture
US7620050B2 (en) 2004-09-10 2009-11-17 Canon Kabushiki Kaisha Communication control device and communication control method
US7787454B1 (en) 2007-10-31 2010-08-31 Gigamon Llc. Creating and/or managing meta-data for data storage devices using a packet switch appliance
JP5231926B2 (en) 2008-10-06 2013-07-10 キヤノン株式会社 Information processing apparatus, control method therefor, and computer program
US8493979B2 (en) * 2008-12-30 2013-07-23 Intel Corporation Single instruction processing of network packets
US8014295B2 (en) 2009-07-14 2011-09-06 Ixia Parallel packet processor with session active checker

Cited By (63)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7055123B1 (en) * 2001-12-31 2006-05-30 Richard S. Norman High-performance interconnect arrangement for an array of discrete functional modules
US20040042496A1 (en) * 2002-08-30 2004-03-04 Intel Corporation System including a segmentable, shared bus
US7360007B2 (en) * 2002-08-30 2008-04-15 Intel Corporation System including a segmentable, shared bus
US20050216625A1 (en) * 2004-03-09 2005-09-29 Smith Zachary S Suppressing production of bus transactions by a virtual-bus interface
US20100241746A1 (en) * 2005-02-23 2010-09-23 International Business Machines Corporation Method, Program and System for Efficiently Hashing Packet Keys into a Firewall Connection Table
US8112547B2 (en) * 2005-02-23 2012-02-07 International Business Machines Corporation Efficiently hashing packet keys into a firewall connection table
US20080276116A1 (en) * 2005-06-01 2008-11-06 Tobias Bjerregaard Method and an Apparatus for Providing Timing Signals to a Number of Circuits, an Integrated Circuit and a Node
US8112654B2 (en) 2005-06-01 2012-02-07 Teklatech A/S Method and an apparatus for providing timing signals to a number of circuits, an integrated circuit and a node
US20070017694A1 (en) * 2005-07-20 2007-01-25 Tomoyuki Kubo Wiring board and manufacturing method for wiring board
US8885673B2 (en) 2005-08-24 2014-11-11 Intel Corporation Interleaving data packets in a packet-based communication system
US20070047584A1 (en) * 2005-08-24 2007-03-01 Spink Aaron T Interleaving data packets in a packet-based communication system
US8325768B2 (en) * 2005-08-24 2012-12-04 Intel Corporation Interleaving data packets in a packet-based communication system
US20090089029A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Enhanced execution speed to improve simulation performance
US20100318339A1 (en) * 2007-09-28 2010-12-16 Rockwell Automation Technologies, Inc. Simulation controls for model variability and randomness
US20090089234A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Automated code generation for simulators
US8548777B2 (en) 2007-09-28 2013-10-01 Rockwell Automation Technologies, Inc. Automated recommendations from simulation
US7801710B2 (en) * 2007-09-28 2010-09-21 Rockwell Automation Technologies, Inc. Simulation controls for model variability and randomness
US20090089227A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Automated recommendations from simulation
US8417506B2 (en) 2007-09-28 2013-04-09 Rockwell Automation Technologies, Inc. Simulation controls for model variability and randomness
US20090089027A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Simulation controls for model variability and randomness
US20090089030A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Distributed simulation and synchronization
US8069021B2 (en) 2007-09-28 2011-11-29 Rockwell Automation Technologies, Inc. Distributed simulation and synchronization
US20090089031A1 (en) * 2007-09-28 2009-04-02 Rockwell Automation Technologies, Inc. Integrated simulation of controllers and devices
US7995618B1 (en) * 2007-10-01 2011-08-09 Teklatech A/S System and a method of transmitting data from a first device to a second device
US20090268736A1 (en) * 2008-04-24 2009-10-29 Allison Brian D Early header CRC in data response packets with variable gap count
US20090268727A1 (en) * 2008-04-24 2009-10-29 Allison Brian D Early header CRC in data response packets with variable gap count
US20090271532A1 (en) * 2008-04-24 2009-10-29 Allison Brian D Early header CRC in data response packets with variable gap count
US8811430B2 (en) * 2009-04-29 2014-08-19 Intel Corporation Packetized interface for coupling agents
US9736276B2 (en) * 2009-04-29 2017-08-15 Intel Corporation Packetized interface for coupling agents
US20120176909A1 (en) * 2009-04-29 2012-07-12 Mahesh Wagh Packetized Interface For Coupling Agents
US8170062B2 (en) * 2009-04-29 2012-05-01 Intel Corporation Packetized interface for coupling agents
US20140307748A1 (en) * 2009-04-29 2014-10-16 Mahesh Wagh Packetized Interface For Coupling Agents
US20100278195A1 (en) * 2009-04-29 2010-11-04 Mahesh Wagh Packetized Interface For Coupling Agents
US8823495B2 (en) * 2010-03-12 2014-09-02 Zte Corporation Sight spot guiding system and implementation method thereof
US20130038427A1 (en) * 2010-03-12 2013-02-14 Zte Corporation Sight Spot Guiding System and Implementation Method Thereof
US20130229290A1 (en) * 2012-03-01 2013-09-05 Eaton Corporation Instrument panel bus interface
CN104144827A (en) * 2012-03-01 2014-11-12 伊顿公司 Instrument panel bus interface
US20150012679A1 (en) * 2013-07-03 2015-01-08 III Holdings 2, LLC Implementing remote transaction functionalities between data processing nodes of a switched interconnect fabric
US10243882B1 (en) 2017-04-13 2019-03-26 Xilinx, Inc. Network on chip switch interconnect
US10673745B2 (en) 2018-02-01 2020-06-02 Xilinx, Inc. End-to-end quality-of-service in a network-on-chip
US10503690B2 (en) 2018-02-23 2019-12-10 Xilinx, Inc. Programmable NOC compatible with multiple interface communication protocol
US10621129B2 (en) 2018-03-27 2020-04-14 Xilinx, Inc. Peripheral interconnect for configurable slave endpoint circuits
US10505548B1 (en) 2018-05-25 2019-12-10 Xilinx, Inc. Multi-chip structure having configurable network-on-chip
US11263169B2 (en) 2018-07-20 2022-03-01 Xilinx, Inc. Configurable network-on-chip for a programmable device
US10838908B2 (en) 2018-07-20 2020-11-17 Xilinx, Inc. Configurable network-on-chip for a programmable device
US10824505B1 (en) 2018-08-21 2020-11-03 Xilinx, Inc. ECC proxy extension and byte organization for multi-master systems
US10963460B2 (en) 2018-12-06 2021-03-30 Xilinx, Inc. Integrated circuits and methods to accelerate data queries
US20200250281A1 (en) * 2019-02-05 2020-08-06 Arm Limited Integrated circuit design and fabrication
US10796040B2 (en) * 2019-02-05 2020-10-06 Arm Limited Integrated circuit design and fabrication
US10936486B1 (en) 2019-02-21 2021-03-02 Xilinx, Inc. Address interleave support in a programmable device
US10680615B1 (en) 2019-03-27 2020-06-09 Xilinx, Inc. Circuit for and method of configuring and partially reconfiguring function blocks of an integrated circuit device
US20220027308A1 (en) * 2019-05-09 2022-01-27 SambaNova Systems, Inc. Control Barrier Network for Reconfigurable Data Processors
US11580056B2 (en) * 2019-05-09 2023-02-14 SambaNova Systems, Inc. Control barrier network for reconfigurable data processors
US11188312B2 (en) 2019-05-23 2021-11-30 Xilinx, Inc. Hardware-software design flow with high-level synthesis for heterogeneous and programmable devices
US10891414B2 (en) 2019-05-23 2021-01-12 Xilinx, Inc. Hardware-software design flow for heterogeneous and programmable devices
US10891132B2 (en) 2019-05-23 2021-01-12 Xilinx, Inc. Flow convergence during hardware-software design for heterogeneous and programmable devices
US11301295B1 (en) 2019-05-23 2022-04-12 Xilinx, Inc. Implementing an application specified as a data flow graph in an array of data processing engines
US11645053B2 (en) 2019-05-23 2023-05-09 Xilinx, Inc. Hardware-software design flow with high-level synthesis for heterogeneous and programmable devices
US10977018B1 (en) 2019-12-05 2021-04-13 Xilinx, Inc. Development environment for heterogeneous devices
US11496418B1 (en) 2020-08-25 2022-11-08 Xilinx, Inc. Packet-based and time-multiplexed network-on-chip
US11336287B1 (en) 2021-03-09 2022-05-17 Xilinx, Inc. Data processing engine array architecture with memory tiles
US11520717B1 (en) 2021-03-09 2022-12-06 Xilinx, Inc. Memory tiles in data processing engine array
US11848670B2 (en) 2022-04-15 2023-12-19 Xilinx, Inc. Multiple partitions in a data processing array

Also Published As

Publication number Publication date
US20070220232A1 (en) 2007-09-20
GB2389689B (en) 2005-06-08
US20050243827A1 (en) 2005-11-03
JP2004525449A (en) 2004-08-19
GB2390506B (en) 2005-03-23
GB2374442A (en) 2002-10-16
US7917727B2 (en) 2011-03-29
GB2377519B (en) 2005-06-15
US20110083000A1 (en) 2011-04-07
GB2374443B (en) 2005-06-08
GB0203632D0 (en) 2002-04-03
US20040130367A1 (en) 2004-07-08
GB2374443A (en) 2002-10-16
AU2002233500A1 (en) 2002-08-28
US8200686B2 (en) 2012-06-12
US7290162B2 (en) 2007-10-30
JP2004524617A (en) 2004-08-12
GB0319801D0 (en) 2003-09-24
US8127112B2 (en) 2012-02-28
US20070217453A1 (en) 2007-09-20
CN1504035A (en) 2004-06-09
WO2002065259A1 (en) 2002-08-22
GB0203633D0 (en) 2002-04-03
US7818541B2 (en) 2010-10-19
GB2389689A (en) 2003-12-17
US20030041163A1 (en) 2003-02-27
US20020161926A1 (en) 2002-10-31
CN100367730C (en) 2008-02-06
GB0321186D0 (en) 2003-10-08
CN1613041A (en) 2005-05-04
GB2377519A (en) 2003-01-15
US20050242976A1 (en) 2005-11-03
WO2002065700A2 (en) 2002-08-22
GB2390506A (en) 2004-01-07
US7856543B2 (en) 2010-12-21
GB0203634D0 (en) 2002-04-03
US20020159466A1 (en) 2002-10-31
GB2374442B (en) 2005-03-23
WO2002065700A3 (en) 2002-11-21

Similar Documents

Publication Publication Date Title
US20040114609A1 (en) Interconnection system
EP3776231B1 (en) Procedures for implementing source based routing within an interconnect fabric on a system on chip
EP3400688B1 (en) Massively parallel computer, accelerated computing clusters, and two dimensional router and interconnection network for field programmable gate arrays, and applications
US8599863B2 (en) System and method for using a multi-protocol fabric module across a distributed server interconnect fabric
US9680770B2 (en) System and method for using a multi-protocol fabric module across a distributed server interconnect fabric
US10608640B1 (en) On-chip network in programmable integrated circuit
US10707875B1 (en) Reconfigurable programmable integrated circuit with on-chip network
US11336287B1 (en) Data processing engine array architecture with memory tiles
RU2283507C2 (en) Method and device for configurable processor
US11730325B2 (en) Dual mode interconnect
US20040100900A1 (en) Message transfer system
US11520717B1 (en) Memory tiles in data processing engine array
US8571016B2 (en) Connection arrangement
US20070245044A1 (en) System of interconnections for external functional blocks on a chip provided with a single configurable communication protocol
Nejad et al. An FPGA bridge preserving traffic quality of service for on-chip network-based systems
US10990552B1 (en) Streaming interconnect architecture for data processing engine array
Bianchini et al. The Tera project: A hybrid queueing ATM switch architecture for LAN
Aust et al. Real-time processor interconnection network for FPGA-based multiprocessor system-on-chip (MPSoC)
Guruprasad et al. An Efficient Bridge Architecture for NoC Based Systems on FPGA Platform
Rekha et al. Analysis and Design of Novel Secured NoC for High Speed Communications
Reddy et al. Design of Reconfigurable NoC Architecture for Low Area and Low Power Applications
Sha et al. Design of Cloud Server Based on Godson Processors
Ferrer et al. Quality of Service in NoC for Reconfigurable Space Applications
Khan et al. Design and implementation of an interface control unit for rapid prototyping
Sharma et al. FPGA cluster based high throughput architecture for cryptography and cryptanalysis

Legal Events

Date Code Title Description
AS Assignment

Owner name: CLEARSPEED TECHNOLOGY LIMITED, UNITED KINGDOM

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SWARBRICK, IAN;WINSER, PAUL;RYAN, STUART;REEL/FRAME:015007/0932;SIGNING DATES FROM 20030911 TO 20040116

AS Assignment

Owner name: CLEARSPEED SOLUTIONS LIMITED, UNITED KINGDOM

Free format text: CHANGE OF NAME;ASSIGNOR:CLEARSPEED TECHNOLOGY LIMITED;REEL/FRAME:015317/0484

Effective date: 20040701

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION