US20040114609A1 - Interconnection system - Google Patents
- Publication number
- US20040114609A1 (application US10/468,167)
- Authority
- US
- United States
- Prior art keywords
- interconnection system
- node
- bus
- data
- interconnection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/04—Generating or distributing clock signals or signals derived directly therefrom
- G06F1/10—Distribution of clock signals, e.g. skew
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/80—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/30—Circuit design
- G06F30/32—Circuit design at the digital level
- G06F30/327—Logic synthesis; Behaviour synthesis, e.g. mapping logic, HDL to netlist, high-level language to RTL or netlist
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/54—Store-and-forward switching systems
- H04L12/56—Packet switching systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/74—Address processing for routing
- H04L45/742—Route cache; Operation thereof
Definitions
- the present invention relates to an interconnection network. In particular, but not exclusively, it relates to an intra chip interconnection network.
- a typical bus system for interconnecting a plurality of functional units consists of either a set of wires with tri-state drivers, or two uni-directional data-paths incorporating multiplexers to get data onto the bus. Access to the bus is controlled by an arbitration unit, which accepts requests to use the bus, and grants one functional unit access to the bus at any one time.
- the arbiter may be pipelined, and the bus itself may be pipelined in order to achieve a higher clock rate.
- the system may comprise a plurality of routers, each of which typically comprises a look-up table. Incoming data is compared with the entries in the routing look-up table in order to route the data onto the bus towards its correct destination.
- Another form of interconnection is a direct interconnection network.
- These types of networks typically comprise a plurality of nodes, each of which is a discrete router chip.
- Each node (router) may connect to a processor and to a plurality of other nodes to form a network topology.
- the object of the present invention is to provide an interconnection network as an on-chip bus system. This is achieved by routing data on the bus as opposed to broadcasting data.
- the routing of the data being achieved by a simple addressing scheme in which each transaction has routing information associated therewith, for example a geographical address, which enables the nodes within the interconnection network to route the transaction to its correct destination.
- the routing information contains information on the direction to send the data packet.
- This routing information is not merely an address of a destination but provides directional information, for example x,y coordinates of a grid to give direction.
- the nodes do not need routing table(s) or global signals to determine the direction since all the information the node needs is contained in the routing information of the data packets. This enables the circuitry of the node and the interconnection system to be simplified, making integration of the system onto a chip feasible.
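The direction-from-coordinates idea above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, and dimension-ordered (x-then-y) routing is assumed purely for concreteness — the text only requires that the coordinates themselves yield a direction, with no routing table involved.

```python
def route_step(node_x, node_y, dest_x, dest_y):
    """Pick the forwarding direction for a packet carrying x,y grid
    coordinates, by comparing them with the node's own position.
    Returns 'consume' when the packet has arrived."""
    if dest_x > node_x:
        return "east"
    if dest_x < node_x:
        return "west"
    if dest_y > node_y:
        return "north"
    if dest_y < node_y:
        return "south"
    return "consume"
```

Note that the decision uses only values carried in the packet plus the node's own fixed position — exactly why no look-up table or global signal is needed.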
- if each functional unit is connected to a node, and all nodes are connected together, then a pipelined connection will exist between each pair of nodes in the system.
- the number of intervening nodes will govern the number of pipeline stages. If there is a pair of joined nodes where the distance between them is too great to transmit data within a single clock cycle, a repeater block can be inserted between the nodes. This block registers the data, while maintaining the same protocol as the other bus blocks. The inclusion of the repeater blocks allows interconnections of arbitrary length to be created.
- the interconnection system according to the present invention can be utilised in an intra-chip interconnection network.
- Data transfers are all packetized, and the packets may be of any length that is a multiple of the data-path width.
- the blocks used to create the interconnection network (nodes and T-switches) all have registers on the data-path(s).
- the main advantage of the present invention is that it is inherently re-usable.
- the implementer need only instantiate enough functional blocks to form an interconnection of the correct length, with the right number of interfaces, and with enough repeater blocks to achieve the desired clock rate.
- the interconnection system in accordance with the present invention employs distributed arbitration.
- the arbitration capability grows as more blocks are added. Therefore, if the bus needs to be lengthened, it is a simple matter of instantiating more nodes and possibly repeaters. Since each module manages its own arbitration within itself, the overall arbitration capability of the interconnect increases. This makes the bus system of the present invention more scalable (in length and overall bandwidth) than other conventional bus systems.
- the interconnection in accordance with the present invention is efficient in terms of power consumption. Since packets are routed, rather than broadcast, only the wires between the source and destination node are toggled. The remaining bus drivers are clock-gated. Hence the system of the present invention consumes less power.
- every node on the bus has a unique address associated with it; an interface address.
- a field in the packet is reserved to hold a destination interface address.
- Each node on the bus will interrogate this field of an incoming packet; if it matches its interface address it will route the packet off the interconnection (or bus), if it does not match it will route the packet down the bus.
- the addressing scheme could be extended to support “wildcards” for broadcast messages; if a subset of the address matches the interface address then the packet is routed off the bus and passed on down the bus, otherwise it is just sent on down the bus.
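The node's consume-or-forward decision, including the wildcard extension just described, can be sketched as below. The function name and the mask-based wildcard encoding are assumptions for illustration; the text only specifies the three outcomes.

```python
def node_action(packet_addr, interface_addr, wildcard_mask=0):
    """Decide what a node does with an incoming packet.

    wildcard_mask marks address bits treated as "don't care" for
    broadcast; with mask 0 this reduces to the exact-match rule.
    An exact match routes the packet off the bus; a wildcard match
    routes it off the bus AND passes it on down the bus; otherwise
    it is just sent on down the bus."""
    if packet_addr == interface_addr:
        return "consume"
    if wildcard_mask and (packet_addr & ~wildcard_mask) == (interface_addr & ~wildcard_mask):
        return "consume_and_forward"
    return "forward"
```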
- For packets coming onto the bus, each interface unit interrogates the destination interface address of the packet. This is used to decide which direction a packet arriving on the bus from an attached unit is to be routed. In the case of a linear bus, this could be a simple comparison: if the destination address is greater than the interface address of the source of the data then the packet is routed “up” the bus, otherwise the packet is routed “down” the bus. This could be extended such that each interface unit maps destination addresses, or ranges of addresses, to directions on the bus.
- the interface unit sets a binary lane signal based on the result of this comparison.
- functionality is split between the node and interface unit. All “preparation” of the data to be transported (including protocol requirements) is carried out in the interface unit. This allows greater flexibility as the node is unchanging irrespective of the type of data to be transported, allowing the node to be re-used in different circuits. More preferably the node directs the packet off the interconnection system to a functional unit.
- the interface unit can carry out the following functions: take the packet from the functional unit; ensure a correct destination module ID, head and tail bit; compare the destination module ID to the local module ID and set a binary lane signal based on the result of this comparison; pack the module ID, data and any high-level (non-bus) control signals into a flit; implement any protocol change necessary; and pass the lane signal and flit to the node using the inject protocol.
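The send-path steps above can be sketched in miniature. The field widths and bit layout here are invented for illustration (the patent does not fix them); only the operations — the linear-bus comparison setting a binary lane signal, and the packing of head/tail bits and module ID alongside the data — follow the text.

```python
def prepare_flit(dest_id, local_id, data, head=True, tail=True):
    """Set the binary lane signal and pack a flit.

    Assumed layout (illustrative only): [tail(1) | head(1) | module_id(8) | data(32)].
    Lane encoding follows the example given later in the text (1 for one
    direction, 0 for the other on a linear bus)."""
    lane = 1 if dest_id > local_id else 0            # linear-bus comparison
    flit = ((tail & 1) << 41) | ((head & 1) << 40) \
         | ((dest_id & 0xFF) << 32) | (data & 0xFFFFFFFF)
    return lane, flit
```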
- a T-junction or switch behaves in a similar way; the decision here is simply whether to route the packet down one branch or the other. This would typically be done for ranges of addresses; if the address is larger than some predefined value then the packets are routed left, otherwise they are routed right. However, more complex routing schemes could be implemented if required.
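The range-based T-junction rule just described amounts to a one-line comparison. The split value and function name below are hypothetical; the text gives only the scheme "larger than some predefined value goes left, otherwise right".

```python
def t_junction_port(dest_addr, split=0x80):
    """Route a packet at a T-junction by address range: addresses above
    the predefined split value go down one branch ('left'), the rest go
    down the other ('right'). The split value is an assumption."""
    return "left" if dest_addr > split else "right"
```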
- the addressing scheme can be extended to support inter-chip communication.
- a field in the address is used to define a target chip address with, for example, 0 in this field representing a local address of the chip.
- this field will be compared with the pre-programmed address of the chip. If they match then the field is set to zero and the local routing operates as above. If they do not match, then the packet is routed along the bus to the appropriate inter-chip interface in order to be routed towards its final destination.
- This scheme could be extended to allow a hierarchical addressing scheme to manage routing across systems, sub-systems, boards, groups of chips, as well as individual chips.
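The inter-chip comparison in the preceding bullets can be sketched as below — a hedged illustration, with the return convention invented here. It shows the rule that a matching chip field is zeroed (0 meaning "local") so that local routing then proceeds as before, while a mismatch forwards the packet towards an inter-chip interface.

```python
def chip_route(packet_chip_field, my_chip_addr):
    """Compare the packet's target-chip field with the pre-programmed
    chip address. On a match, zero the field and route locally; on a
    mismatch, leave it intact and forward towards the inter-chip
    interface. Returns (new_chip_field, action)."""
    if packet_chip_field == my_chip_addr:
        return 0, "route_locally"
    return packet_chip_field, "forward_to_interchip"
```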
- the system according to the present invention is not suitable for all bus-type applications.
- the source and destination transactors are decoupled, since there is no central arbitration point.
- the advantage of the approach of the present invention is that long buses (networks) can be constructed, with very high aggregate bandwidth.
- the system of the present invention is protocol agnostic.
- the interconnection of the present invention merely transports data packets.
- Interface units in accordance with the present invention manage all protocol-specific features. This means that it is easy to migrate to a new protocol, since only the interface units need to be re-designed.
- the present invention also provides flexible topology and length.
- the repeater blocks of the present invention allow very high clock rates in that the overall modular structure of the interconnection prevents the clock rate being limited by long wires. This simplifies the synthesis and layout.
- the repeater blocks not only pipeline the data as it goes downstream but also implement a flow control protocol, pipelining blockage information back up the interconnection (or bus) (rather than a blocking signal being distributed globally).
- a feature of this mechanism is that data compression (inherent buffering) of at least double the latency figure is achieved on the bus, i.e. if the latency through the repeater is one cycle then two flow control digits will concatenate when the bus is blocked. A flow control digit (flit) is the basic unit of data transfer over the interconnection of the present invention; it includes n bytes of data, as well as some side-band control signals, and its size equals the size of the bus data-path. This means that the scope of any blocking is minimised, thus reducing any queuing requirement in a functional block.
- the flow of flow control digits (flits) is managed by a flow control protocol, in conjunction with double-buffering in the store block (and repeater unit) as described previously.
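The double-buffering behaviour can be sketched as a two-register store — a simplified model (class and method names are assumptions), showing only the property the text relies on: in normal flow one register is in use, and when downstream blocks, two in-flight flits concatenate locally instead of the blockage propagating immediately upstream.

```python
class Store:
    """Two flit-wide registers. `can_accept` plays the role of the
    flow-control signal sent back upstream."""
    def __init__(self):
        self.regs = []                    # at most two buffered flits

    def can_accept(self):
        return len(self.regs) < 2

    def push(self, flit):
        assert self.can_accept()          # upstream must honour flow control
        self.regs.append(flit)

    def pop(self, downstream_ready):
        if downstream_ready and self.regs:
            return self.regs.pop(0)       # forward oldest flit downstream
        return None                       # blocked: hold both flits
```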
- Customised interface units handle protocol specific features. These typically involve packing and unpacking of control information and data, and address translation. A customised interface unit can be created to match any specific concurrency.
- Packets are injected (gated) onto the interconnection at each node, so that each node is allocated a certain share of the overall bandwidth (e.g. by being able to send, say, 10 flow control digits within every 100 cycles). This distributed scheme controls the overall bandwidth allocation.
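A per-node inject gate along these lines can be sketched as a simple windowed budget. The windowing mechanism below is one plausible reading of the "10 flits per 100 cycles" example, not a description of the actual hardware; names are hypothetical.

```python
class InjectGate:
    """Allow at most `quota` flits onto the bus in every `window` cycles
    (defaults follow the example in the text)."""
    def __init__(self, quota=10, window=100):
        self.quota, self.window = quota, window
        self.sent, self.cycle = 0, 0

    def tick(self, want_to_send):
        """Advance one cycle; return True if an injection is granted."""
        granted = want_to_send and self.sent < self.quota
        if granted:
            self.sent += 1
        self.cycle += 1
        if self.cycle == self.window:     # new window: reset the budget
            self.cycle, self.sent = 0, 0
        return granted
```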
- FIG. 1 is a block schematic diagram of the system incorporating the interconnection system according to an embodiment of the present invention
- FIG. 2 is a block schematic diagram illustrating the initiator and target of a virtual component interface system of FIG. 1,
- FIG. 3 is a block schematic diagram of the node of the interconnection system shown in FIG. 1;
- FIG. 4 is a block schematic diagram of connection over the interconnection according to the present invention between virtual components of the system shown in FIG. 1;
- FIG. 5 a is a diagram of the typical structure of the T-switch of FIG. 1;
- FIG. 5 b is a diagram showing the internal connection of the T-switch of FIG. 5 a;
- FIG. 6 illustrates the Module ID (interface ID) encoding of the system of FIG. 1;
- FIG. 7 illustrates handshaking signals in the interconnection system according to an embodiment of the present invention
- FIG. 9 illustrates blocking for two cycles of the interconnection system according to an embodiment of the present invention;
- FIG. 10 illustrates a virtual component interface handshake according to an embodiment of the present invention
- FIG. 11 illustrates a linear chip arrangement of the system according to an embodiment of the present invention
- FIG. 12 is a schematic block diagram of the interconnection system of the present invention illustrating an alternative topology
- FIG. 13 is a schematic block diagram of the interconnection system of the present invention illustrating a further alternative topology
- FIG. 14 illustrates an example of a traffic handling subsystem according to an embodiment of the present invention
- FIG. 15 illustrates a system for locating chips on a virtual grid according to a method of a preferred embodiment of the present invention.
- FIG. 16 illustrates routing a transaction according to the method of a preferred embodiment of the present invention.
- the basic mechanism for communicating data and control information between functional blocks is that blocks exchange messages using the interconnection system 100 according to the present invention.
- the bus system can be extended to connect blocks in a multi chip system, and the same mechanism works for blocks within a chip or blocks on different chips.
- An example of a system 100 incorporating the interconnection system 110 according to an embodiment of the present invention, as shown in FIG. 1, comprises a plurality of reusable on-chip functional blocks or virtual component blocks 105 a , 105 b and 105 c .
- These functional units interface to the interconnection and can be fixed. They can be re-used at various levels of abstraction (e.g. RTL, gate level, GDSII layout data) in different circuit designs. The topology can be fixed once the size, aspect ratio and the location of the I/Os to the interconnection are known.
- Each on-chip functional unit 105 a , 105 b , 105 c is connected to the interconnection system 110 via its interface unit.
- the interface unit handles address decoding and protocol translation.
- the on-chip functional block 105 a for example, is connected to the interconnection system 110 via an associated virtual component interface initiator 115 a and peripheral virtual component interface initiator 120 a.
- the on-chip functional block 105 b is connected to the interconnection system 110 via an associated virtual component interface target 125 b and peripheral virtual component interface target 130 b.
- the on-chip functional block 105 c is connected to the interconnection system 110 via an associated virtual component interface initiator 115 c and peripheral virtual component interface target 130 c.
- the associated initiators and targets for each on-chip functional block shown in FIG. 1 are purely illustrative and may vary depending on the associated block requirements.
- a functional block may have a number of connections to the interconnection system. Each connection has an advanced virtual component interface (extensions forming a superset of the basic virtual component interface; this is the protocol used for the main data interfaces in the system of the present invention) or a peripheral virtual component interface (a low bandwidth interface allowing atomic operations, mainly used in the present invention for control register access).
- Virtual component interface is an OCB standard interface to communicate between a bus and/or virtual component, which is independent of any specific bus or virtual component protocol.
- There are three types of virtual component interface: peripheral virtual component interface 120 a , 130 b , 130 c ; basic virtual component interface; and advanced virtual component interface.
- the basic virtual component interface is a wider, higher bandwidth interface than the peripheral virtual component interface.
- the basic virtual component interface allows split transactions. Split transactions are where the request for data and the response are decoupled, so that a request for data does not need to wait for the response to be returned before initiating further transactions.
- Advanced virtual component interface is a superset of basic virtual component interface; Advanced virtual component interface and peripheral virtual component interface have been adopted in the system according to the embodiment of the present invention.
- the advanced virtual component interface unit comprises a target and initiator.
- the target and initiator are virtual components that send request packets and receive response packets.
- the initiator is the agent that initiates transactions, for example, DMA (or EPU on F150).
- an interface unit that initiates a read or write transaction is called an initiator 210 (issues a request 220 ), while an interface that receives the transaction is called the target 230 (responds to a request 240 ).
- connections between each on-chip functional block 105 a , 105 b and 105 c and its associated initiators and targets are made using the virtual component interface protocol.
- Each initiator 115 a , 120 a , 115 c and target 125 b , 130 b , 130 c is connected to a unique node 135 , 140 , 145 , 150 , 155 and 160 .
- Communication between each initiator 115 a , 120 a , 115 c and target 125 b , 130 b , 130 c uses the protocol in accordance with the embodiment of the present invention and as described in more detail below.
- the interconnection system 110 comprises three separate buses 165 , 170 and 175 .
- the RTL components have parameterisable widths, so these may be three instances of different width.
- An example might be a 64-bit wide peripheral virtual component bus 170 (32 bit address+32 data bits), a 128-bit advanced virtual component interface bus 165 , and a 256-bit advanced virtual component interface bus 175 .
- Although three separate buses are illustrated here, it is appreciated that the interconnection system of the present invention may incorporate any number of separate buses.
- a repeater unit 180 may be inserted for all the buses 165 , 170 and 175 . There is no restriction on the length of the buses 165 , 170 and 175 . Variations in the length of the buses 165 , 170 and 175 would merely require an increased number of repeater units 180 . Repeater units would of course only be required when the timing constraints between two nodes cannot be met due to the length of wire of the interconnection.
- T-switches (3-way connectors or the like) 185 can be provided.
- the interconnection system of the present invention can be used in any topology but care should be taken when the topology contains loops as deadlock may result.
- Data is transferred on the interconnection network of the present invention in packets.
- the packets may be of any length that is a multiple of the data-path width.
- the nodes 135 , 140 , 145 , 150 , 155 and 160 according to the present invention used to create the interconnection network (node and T-switch) all have registers on the data-path(s).
- Each interface unit is connected to a node within the interconnection system itself, and therefore to one particular lane of the bus. Connections may be of initiator or target type, but not both, following the conventions of virtual component interface. In practice every block is likely to have a peripheral virtual component interface target interface for configuration and control.
- bus components according to the embodiment of the present invention use distributed arbitration, where each block in the bus system manages access to its own resources.
- a node 135 according to the embodiment of the present invention is illustrated in FIG. 3.
- Each node 135 , 140 , 145 , 150 , 155 and 160 is substantially similar.
- Node 135 is connected to the bus 175 of FIG. 1.
- Each node comprises a first and second input store 315 , 320 .
- the first input store 315 has an input connected to a first bus lane 305 .
- the second input store 320 has an input connected to a second bus lane 310 .
- the output of the first input store 315 is connected to a third bus lane 306 and the output of the second input store 320 is connected to a fourth bus lane 311 .
- Each node further comprises an inject control unit 335 and a consume control unit 325 .
- the node may not require consume arbitration; for example, the node may have an output for each uni-directional lane, with the consume handshaking retained.
- the input of the inject control unit 335 is connected to the output of an interface unit of the respective functional unit for that node.
- the outputs of the inject control unit 335 are connected to a fifth bus lane 307 and sixth bus lane 312 .
- the input of the consume control unit 325 is connected to the output of a multiplexer 321 .
- the inputs of the multiplexer 321 are connected to the fourth bus lane 311 and the third bus lane 306 .
- the output of the consume control unit 325 is connected to a bus 330 which is connected to the interface unit of the respective functional unit for that node.
- the fifth bus lane 307 and the third bus lane 306 are connected to the inputs of a multiplexer 308 .
- the output of the multiplexer 308 is connected to the first bus lane 305 .
- the fourth bus lane 311 and the sixth bus lane 312 are connected to the inputs of a multiplexer 313 .
- the output of the multiplexer 313 is connected to the second bus lane 310 .
- the nodes are the connection points where data leaves or enters the bus. They also form part of the transport medium.
- the node forms part of the bus lane which it connects to, including both directions of data path.
- the node conveys data on the lane to which it connects, with one cycle of latency when not blocked. It also allows the connecting functional block to inject and consume data in either direction, via its interface unit. Arbitration of injected or passing data is performed entirely within the node.
- bus 175 consists of a first lane 305 and a second lane 310 .
- the first and second lanes 305 and 310 are physically separate unidirectional buses that are multiplexed and de-multiplexed to the same interfaces within the node 135 .
- the direction of data flow of the first lane 305 is in the opposite direction to that of the second lane 310 .
- Each lane 305 and 310 has a lane number.
- the lane number is a parameter that is passed from the interface unit to the node to determine which lane (and hence which direction) each packet is sent to.
- the direction of the data flow of the first and second lanes 305 and 310 can be in the same direction. This would be desirable if the blocks transacting on the bus only need to send packets in one direction.
- the node 135 is capable of concurrently receiving and injecting data on the same bus lane. At the same time it is possible to pass data through on the other lane.
- Each uni-directional lane 305 , 310 carries a separate stream 306 , 307 , 311 , 312 of data. These streams 306 , 307 , 311 , 312 are multiplexed together at the point 321 where data leaves the node 135 into the on-chip module 105 a (not shown here) via the interface unit 115 a and 120 a (not shown here).
- the data streams 306 , 307 , 311 , 312 are de-multiplexed from the on-chip block 105 a onto the bus lanes 305 and 310 in order to place data on the interconnection 110 .
- Each lane can independently block or pass data through. Data can be consumed from one lane at a time, and injected on one lane at the same time. Concurrent inject and consume on the same lane is also permitted. Which lane each packet is injected on is determined within the interface unit.
- Each input store (or register) 315 and 320 registers the data as it passes from node to node.
- Each store 315 , 320 contains two flit-wide registers. When there is no competition for bus resources, only one of the registers is used. When the bus blocks, both registers are then used. It also implements the ‘block on header’ feature. This is needed to allow packets to be blocked at the header flit so that a new packet can be injected onto the bus.
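The 'block on header' feature can be modelled separately from the double-register behaviour. This is an illustrative sketch (class name and the `(is_header, payload)` flit encoding are assumptions): a stall request only takes effect when the next flit is a header, so a packet already in flight is never cut in half, and the gap created at the packet boundary lets a new packet be injected.

```python
class HeaderBlockingStore:
    """Pass flits through, but honour a pending block request only at a
    packet header."""
    def __init__(self):
        self.hold = False

    def request_block(self):
        self.hold = True                  # takes effect at the next header flit

    def release(self):
        self.hold = False

    def forward(self, flit):
        is_header, _payload = flit
        if self.hold and is_header:
            return None                   # packet blocked at its header
        return flit                       # mid-packet flits keep flowing
```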
- the output interface unit 321 , 325 multiplexes both bus lanes 305 , 310 onto one lane 330 that feeds into the on-chip functional unit 105 a via the interface unit which is connected to the node 135 .
- the output interface unit 321 , 325 also performs an arbitration function, granting one lane access to the on-chip functional unit, while blocking the other.
- Each node also comprises an input interface unit 335 .
- the input interface unit 335 performs de-multiplexing of packets onto one of the bus lanes 305 , 310 . It also performs an arbitration function, blocking the packet that is being input until the requested lane is available.
- a plurality of repeater units 180 are provided at intervals along the length of the interconnection 110 .
- Each repeater unit 180 is used to introduce extra registers on the data path. It adds an extra cycle of latency, but is only used where there is a difficulty meeting timing constraints.
- Each repeater unit 180 comprises a store similar to the store unit of the nodes. The store unit merely passes data onwards, and implements blocking behaviour. There is no switching carried out in the repeater unit.
- the repeater block allows for more freedom in chip layout. For example, it allows long lengths of wire between nodes; or, where a block has a single node connecting to a single lane, repeaters may be inserted into the other lanes in order to produce uniform timing characteristics over all lanes. There may be more than one repeater between two nodes.
- the system according to the embodiment of the present invention is protocol agnostic, that is to say, the data-transport blocks such as the nodes 135 , 140 , 145 , 150 , 155 , 160 , repeater units 180 and T-switch 185 simply route data packets from a source interface to a destination interface. Each packet will contain control information and data. The packing and unpacking of this information is performed in the interface units 115 a , 120 a , 125 b , 130 b , 115 c , 130 c . In respect of the preferred embodiment, these interface units are virtual component interfaces, but it is appreciated that any other protocol could be supported by creating customised interface units.
- a large on-chip block may have several interfaces to the same bus.
- the target and initiator 115 a , 120 a , 125 b , 130 b , 115 c , 130 c of the interface units perform conversion between the advanced virtual component interface and bus protocols in the initiator, and from the bus protocol to the advanced virtual component interface in the target.
- the protocol is an asynchronous handshake on the advanced virtual component interface side illustrated in FIG. 10.
- the interface unit initiator comprises a send path. This path performs conversion between the advanced virtual component interface communication protocol and the bus protocol. It extracts a destination module ID or interface ID (a block may be connected to several buses, with a different module (interface) ID on each bus) from the address, packs it into the correct part of the packet, and uses the module ID in conjunction with a hardwired routing table to generate a lane number (e.g. 1 for right, 0 for left).
- the initiator blocks the data at the advanced virtual component interface when it cannot be sent onto the bus.
- the interface unit initiator also comprises a response path. The response path receives previously requested data, converting from bus communication protocol to the virtual component interface protocol. It blocks data on the bus if the on-chip virtual component block is unable to receive it.
- the interface unit target comprises a send path which receives incoming read and write requests.
- the target converts from bus communication protocol to advanced virtual component interface protocol. It blocks data on the bus if it cannot be accepted across the virtual component interface.
- the target also comprises a response path which carries read (and for verification purposes, write) requests. It converts advanced virtual component interface communication protocol to bus protocol and blocks data at the advanced virtual component interface if it cannot be sent onto the bus.
- the other type of interface unit utilised in the embodiment of the present invention is a peripheral virtual component unit.
- the main differences between the peripheral virtual component interface and the advanced virtual component interface are that the data interface of the peripheral virtual component interface is potentially narrower (up to 4 bytes) than that of the advanced virtual component interface, and that the peripheral virtual component interface is not split transaction.
- the peripheral virtual component interface units perform conversion between the peripheral virtual component interface and bus protocols in the initiator, and from the bus protocol to peripheral virtual component interface protocol in the target.
- the protocol is an asynchronous handshake on the peripheral virtual component interface side.
- the interface unit initiator comprises a send path. It generates destination module ID and the transport lane number from memory address. The initiator blocks the data at the peripheral virtual component interface when it cannot be sent onto the bus. The initiator also comprises a response path. This path receives previously requested data, converting from bus communication protocol to the peripheral virtual component interface protocol. It also blocks data on the bus if the on-chip block (virtual component block) is unable to receive it.
- the peripheral virtual component interface unit target comprises a send path which receives incoming read and write requests. It blocks data on the bus if it cannot be accepted across the virtual component interface.
- the target also comprises a response path which carries read (and for verification purposes, write) requests. It converts peripheral virtual component interface communication protocol to bus protocol and blocks data at the virtual component interface if it cannot be sent onto the bus.
- the peripheral virtual component interface initiator may comprise a combined initiator and target. This is so that the debug registers (for example) of an initiator can be read from.
- the virtual component (on-chip) blocks can be connected to each other over the interconnection system according to the present invention.
- a first virtual component (on-chip) block 425 is connected point to point to an interface unit target 430 .
- the interface unit target 430 presents a virtual component initiator interface 440 to the virtual component target 445 of on-chip block 425 .
- the interface unit target 430 uses a bus protocol conversion unit 448 to interface to the bus interconnect 450 .
- the interface unit initiator 460 presents a target interface 470 to the initiator 457 of the second on-chip block 455 and, again, uses a bus protocol conversion unit 468 on the other side.
- the T-switch 185 of FIG. 1 is a block that joins 3 nodes, allowing more complex interconnects than simple linear ones.
- the interface ID of each packet is decoded and translated into a single bit, which represents the two possible outgoing ports.
- a hardwired table inside the T-Switch performs this decoding. There is one such table for each input port on the T-Switch. Arbitration takes place for the output ports if there is a conflict. The winner may send the current packet, but must yield when the packet has been sent.
- FIGS. 5 a and 5 b show an example of the structure of a T-switch.
- the T-switch comprises three sets of input/output ports 505 , 510 , 515 connected to each pair of unidirectional bus lanes 520 , 525 , 530 .
- a T-junction 535 , 540 , 545 is provided for each pair of bus lanes 520 , 525 , 530 such that an incoming bus 520 coming into an input port 515 can be output via output port 505 or 510 , for example.
- the T-switch 185 comprises a lane selection unit.
- the lane selection unit takes in module ID of incoming packets and produces a 1-bit result corresponding to the two possible output ports on the switch.
- the T-switch also comprises a store block on each input lane. Each store block stores data flow control digits and allows them to block in place if the output port is temporarily unable to receive. It also performs a block on header function, which allows switching to occur at the packet level (rather than the flow control digit level).
- the T-switch also includes an arbiter for arbitration between requests to use output ports.
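By way of illustration, the T-switch behaviour described above may be sketched in software as follows. This is a behavioural model only, not RTL; the class name, table layout and fixed-priority arbitration policy are illustrative assumptions (the specification says only that a hardwired per-input-port table decodes the module ID to a single output-port bit, and that an arbitration winner must yield once its packet has been sent).

```python
# Behavioural sketch of the T-switch: a hardwired table per input port
# maps each destination module ID to a 1-bit output-port select, and a
# simple arbiter grants one requester per output port at a time.

class TSwitch:
    def __init__(self, port_tables):
        # port_tables[input_port][module_id] -> 0 or 1 (output-port choice)
        self.port_tables = port_tables
        self.busy = {0: None, 1: None}  # current packet owner per output port

    def route(self, input_port, module_id):
        """Decode the destination module ID into a 1-bit output-port select."""
        return self.port_tables[input_port][module_id]

    def arbitrate(self, output_port, requesters):
        """Grant one requester per output port; the winner keeps the port
        only for the current packet (fixed priority assumed here)."""
        if self.busy[output_port] is None and requesters:
            self.busy[output_port] = requesters[0]
        return self.busy[output_port]

    def packet_done(self, output_port):
        self.busy[output_port] = None  # the winner yields after its packet
```

A usage sketch: an incoming packet for module ID 9 on port 0 is decoded to output port 1; the arbiter holds that port for the winner until `packet_done` is called.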
- the interconnection system powers up into a usable state. Routing information is hardcoded into the bus components.
- a destination module interface ID (mod ID) for example as illustrated in FIG. 6 is all that is required to route a packet to another node. In order for that node to return a response packet, it must have been sent the module interface ID of the sender.
- There may be more than one interconnection in a processing system.
- every interface (which includes an inject and consume port) has a unique ID. These IDs are hard-coded at silicon compile-time.
- Units attached to the bus are free to start communicating straight after reset.
- the interface unit will hold off communications (by not acknowledging them) until it is ready to begin operation.
- the interconnection system has an internal protocol that is used throughout. At the interfaces to the on-chip blocks this may be converted to some other protocol, for example virtual component interface as described above.
- the internal protocol will be referred to as the bus protocol. This bus protocol allows single cycle latency for packets travelling along the bus when there is no contention for resources, and allows packets to block in place when contention occurs.
- the bus protocol is used for all internal (non interface/virtual component interface) data transfers. It consists of five signals: occup 705 , head 710 , tail 715 , data 720 and valid 725 between a sender 735 and a receiver 730 . These are shown in FIG. 7.
- the packets consist of one or more flow control digits. On each cycle that the sender asserts the valid signal, the receiver must accept the data on the next positive clock edge.
- the receiver 730 informs the sender 735 about its current status using the occup signal 705 .
- This is a two-bit wide signal.
TABLE I
Occup signal values and their meaning.
Occup  Meaning
00     OK to send a flow control digit this cycle.
01     If sending a flow control digit this cycle, do not send one on the next cycle.
10     Do not send.
- the occup signal 705 tells the sender 735 if and when it is able to send data. When the sender 735 is allowed to transmit a data flow control digit, it is qualified with a valid signal 725 .
- Each node and T-Switch uses these signals to perform switching at the packet level.
- FIG. 8 shows an example of blocking behaviour on the interconnect system according to an embodiment of the present invention.
- the occup signal is set to 01 (binary), meaning ‘if sending a flow control digit this cycle, don't send one on the next cycle’.
- FIG. 9 shows an example of the blocking mechanism more completely.
- the occup signal is set to 01 (binary), then to 10 (binary).
- the sender can resume transmitting flow control digits when the occup signal is set back to 01—at that point it is not currently sending a flow control digit, so it is able to send one on the next cycle.
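The sender-side behaviour described above may be sketched as follows. This is an illustrative behavioural model, not RTL; the occup encodings (00 free, 01 throttle-next-cycle, 10 blocked) are inferred from the examples in the text, and the class and signal names are assumptions.

```python
# Behavioural sketch of the sender side of the bus flow control
# protocol: the receiver's two-bit occup signal gates when the sender
# may assert 'valid' and transmit a flow control digit.

OCCUP_FREE = 0b00         # receiver can accept flow control digits
OCCUP_ALMOST_FULL = 0b01  # if sending a digit this cycle, don't send next cycle
OCCUP_FULL = 0b10         # blocked: do not send

class Sender:
    def __init__(self):
        self.throttled = False  # True when we must skip this cycle

    def step(self, occup, want_to_send):
        """One clock cycle: return True if 'valid' is asserted (a flow
        control digit is sent), given the receiver's occup signal."""
        send = want_to_send and not self.throttled and occup != OCCUP_FULL
        # Sending while occup == 01 forbids sending on the next cycle.
        self.throttled = send and occup == OCCUP_ALMOST_FULL
        return send
```

As in FIG. 9, the sender stops completely while occup is 10, and resumes once occup returns to 01, since it is not sending at that point.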
- the protocol at the boundary between the node and the interface unit is different from that just described, and is similar to that used by the virtual component interface.
- the consume (bus output) protocol is different to the inject protocol but is the minimum logic to allow registered outputs (and thus simplifies synthesis and integration into a system on chip).
- the bus protocol allows the exchange of packets consisting of one or more flow control digits. Eight bits in the upper part of the first flow control digit of each packet carry the destination module ID, and are used by the bus system to deliver the packet. The top 2 bits are also used for internal bus purposes. In all other bit fields, the packing of the flow control digits is independent of the bus system.
- at the interfaces, the virtual component interface protocol is used.
- the interface control and data fields are packed into bus flow control digits by the sending interface and then unpacked at the receiving interface unit.
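The bit-level packing described above may be illustrated as follows, using the positions given in Table II (head bit at width − 1, tail bit at width − 2, 8-bit module ID down to width − 10). The flow control digit width used here is an arbitrary example value; real lane widths differ.

```python
# Illustrative packing of a header flow control digit: head and tail
# bits occupy the top two bit positions, with the 8-bit destination
# module ID immediately below them.

FLIT_WIDTH = 32  # example width for illustration only

def pack_header_flit(mod_id, payload, tail=False):
    """Build the first flow control digit of a packet."""
    assert 0 <= mod_id < 256
    flit = payload & ((1 << (FLIT_WIDTH - 10)) - 1)  # payload below mod ID
    flit |= mod_id << (FLIT_WIDTH - 10)              # 8-bit module ID
    flit |= 1 << (FLIT_WIDTH - 1)                    # head bit
    if tail:
        flit |= 1 << (FLIT_WIDTH - 2)                # tail bit (single-flit packet)
    return flit

def dest_mod_id(flit):
    """Extract the destination module ID a node uses for routing."""
    return (flit >> (FLIT_WIDTH - 10)) & 0xFF
```

A node routing a packet need only apply `dest_mod_id` to the header flit; all other bit fields are carried transparently.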
- the main, high-bandwidth, interface to the bus uses the advanced virtual component interface. All features of the advanced virtual component interface are implemented, with the exception of those used to optimise the internal operation of an OCB.
- the bus interface converts data and control information from the virtual component interface protocol to the bus internal communication protocol.
- control bits and data are packed up into packets and sent to the destination interface unit, where they are unpacked and separated back into data and control.
- although virtual component interface compliant interface units are utilised, it is appreciated that different interface units may be used instead (e.g. ARM AMBA compliant interfaces).
- Table II shows the fields within the data flow control digits that are used by the interconnection system according to an embodiment of the present invention. All other information in the flow control digits is simply transported by the bus. The encoding and decoding is performed by the interface units. The interface units also insert the head and tail bits into the flow control digits, and insert the MOD ID in the correct bit fields.
TABLE II
Specific fields.
Name    Bit                             Comments
Head    FLOW CONTROL DIGIT_WIDTH - 1    Set to ‘1’ to indicate first flow control digit of packet.
Tail    FLOW CONTROL DIGIT_WIDTH - 2    Set to ‘1’ to indicate last flow control digit of packet.
MOD ID  FLOW CONTROL DIGIT_WIDTH - 10   Virtual component interface calls this MOD ID. It is really an interface ID, since a large functional unit could have multiple bus interfaces, in which case it is necessary to distinguish between them.
- the advanced virtual component interface packet types are read request, write request and read response.
- a read request is a single flow control digit packet and all of the relevant virtual component interface control fields are packed into the flow control digit.
- a write request consists of two or more flow control digits.
- the first flow control digit contains virtual component interface control information (e.g. address).
- the subsequent flow control digits contain data and byte enables.
- a read response consists of one or more flow control digits.
- the first and subsequent flow control digits all contain data plus virtual component interface response fields (e.g. RSCRID, RTRDID and RPKTID).
- CMDVAL (width 1, IT): handshake signal. WDATA (width 128, bits 127:0, IT): only for write requests. BE (width 16, bits 143:128, IT): only for write requests.
- Peripheral virtual component interface burst-mode read and write transactions are not supported over the bus, as these cannot be efficiently implemented. For this reason, the peripheral virtual component interface EOP signal should be fixed at logic ‘1’. Any additional processing unit or external units can be attached to the bus, but the EOP signal should again be fixed at logic ‘1’. With this change, the new unit should work normally.
- the read request type is a single flow control digit packet carrying the 32-bit address of the data to be read.
- the read response is a single flow control digit response containing the requested 32 bits of data.
- the write request is a single flow control digit packet containing the 32-bit address of the location to be written, plus the 32 bits of data, and 4 bits of byte enable. The write response permits a target to respond to a write request in the same way that it would to a read request.
- the internal addressing mechanism of the bus is based on the assumption that all on-chip blocks in the system have a fixed 8-bit module ID.
- Virtual component interface specifies the use of a single global address space. Internally the bus delivers packets based on the module ID of each block in the system. The module ID is 8 bits wide. All global addresses will contain the 8-bit module ID, and the interface unit will simply extract the destination module ID from the address. The location of the module ID bits within the address is predetermined.
- the module IDs in each system are divided into groups. Each group may contain up to 16 modules.
- the T-switches in the system use the group ID to determine which output port to send each packet to.
- within each group there may be up to sixteen on-chip blocks, each with a unique subID.
- the inclusion of only sixteen modules within each group does not restrict the bus topology.
- Within each linear bus section there may be more than one group, but modules from different groups may not interleave. There may be more than sixteen modules between T-switches.
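Since each group contains up to sixteen modules, an 8-bit module ID naturally splits into a 4-bit group ID and a 4-bit subID. The following sketch illustrates that decode (the even 4/4 split is an inference from the figures above, and the function names are illustrative):

```python
# Illustrative decode of the 8-bit module ID into the group ID used by
# T-switches for output-port selection and the subID within the group.

def group_id(mod_id):
    """Upper 4 bits: the group, used by T-switches for routing."""
    return (mod_id >> 4) & 0xF

def sub_id(mod_id):
    """Lower 4 bits: the on-chip block within the group (up to 16)."""
    return mod_id & 0xF
```

For example, module ID 0x3A belongs to group 3 as block 10 within that group.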
- the bus system blocks operate at the same clock rate as synthesisable RTL blocks in any given process.
- the target clock rate is 400 MHz.
- the latency of packets on the bus is one cycle at the sending interface unit, plus one cycle per bus block (nodes and repeaters) that the data passes through, one or two additional cycle(s) at the node consume unit, one cycle at the receiving interface unit, and n − 1 cycles, where n is the packet length in flow control digits.
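The latency figure above can be restated as a small helper function. This is a direct worked form of the text; the parameter names are assumptions.

```python
# Worked form of the unloaded-bus latency: 1 cycle at the sending
# interface unit, 1 per bus block traversed, 1-2 at the node consume
# unit, 1 at the receiving interface unit, plus n - 1 cycles for a
# packet of n flow control digits.

def packet_latency(bus_blocks, n_flits, consume_cycles=1):
    assert consume_cycles in (1, 2)
    return 1 + bus_blocks + consume_cycles + 1 + (n_flits - 1)
```

For example, a single-flit packet crossing three bus blocks with a one-cycle consume unit takes 6 cycles.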
- the length of the packets is unlimited. However, consideration should be taken with excessively large packets, as they will utilise a greater amount of the bus resource. Long packets can be more efficient than a number of shorter ones, due to the overhead of having a header flow control digit.
- the interconnection system does not guarantee that requested data items will be returned to a module in the order in which they were requested.
- the on-chip block is responsible for re-ordering packets if the order matters. This is achieved using the advanced virtual component interface pktid field, which is used to tag and reorder outstanding transactions. It cannot be assumed that data will arrive at the on-chip block in the same order that it was requested from other blocks in the system; where ordering is important, the receiving on-chip block must be able to re-order the packets. Failure to adhere to this rule is likely to result in system deadlock.
- the interconnection system according to the present invention offers considerable flexibility in the choice of interconnect topology. However, it is currently not advisable to have loops in the topology as these will introduce the possibility of deadlock.
- a further advantage of the interconnection system according to the present invention is that saturating the bus with packets will not cause it to fail. The packets will be delivered eventually, but the average latency will increase significantly. If the congestion is reduced to below the maximum throughput, then it will return to “normal operation”.
- the interface units may be incorporated into either the on-chip blocks or an area reserved for the bus, depending on the requirements of the overall design. For example, if there is a lot of area free under the bus, using this area rather than adding to the functional block area would make more sense, as it would reduce the overall chip area.
- the interconnection system forms the basis of a system-on-chip platform.
- the nodes contain the necessary “hooks” to handle distribution of the system clock and reset signals.
- the routing of transactions and responses between initiator and target is performed by the interface blocks that connect to the interconnection system, and any intervening T-switch elements in the system. Addressing of blocks is hardwired and geographic, and the routing information is compiled into the interface and T-switch logic at chip integration time.
- the platform requires some modularity at the chip level as well as the block level on chips. Therefore, knowledge of what other chips or their innards they are connected to cannot be hard-coded in the chips themselves, as this may vary on different line cards.
- FIG. 11 illustrates an example of a linear chip arrangement.
- the chips 1100 ( 0 ) to ( 2 ) etc. are numbered sequentially so that any block in any chip knows that a transaction must be routed “up” or “down” the interconnection 1110 to reach its destination, as indicated in the chip ID field of the physical address. This is exactly the same process as the block performs to route to another block on the same chip. In this case a two-level decision is utilised: if in the present chip, then route on the Block ID, else route on the Chip ID.
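The two-level decision for the linear arrangement of FIG. 11 can be sketched as follows (a behavioural illustration; the function name, return labels and comparison-based ordering are assumptions):

```python
# Sketch of the two-level routing decision on a linear multi-chip
# interconnection: route on the Block ID when the transaction is for
# the present chip, otherwise route "up" or "down" on the Chip ID.

def route_linear(my_chip, my_block, dest_chip, dest_block):
    if dest_chip == my_chip:
        if dest_block == my_block:
            return "local"
        return "up" if dest_block > my_block else "down"
    return "up" if dest_chip > my_chip else "down"
```

The same comparison a block already performs for on-chip routing is simply applied first to the chip ID field of the physical address.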
- An alternative topology is shown in FIG. 12. It comprises a first bus lane 1201 and a second bus lane 1202 arranged in parallel.
- the first and second bus lane correspond to the interconnection system of the embodiment of the present invention.
- a plurality of multi threaded array processors (MTAPs) 1210 are connected across the two bus lanes 1201 , 1202 .
- A network input device 1220 , a collector device 1230 , a distributor device 1240 and a network output device 1250 are connected to the second bus lane 1202 , and a table lookup engine 1260 is connected to the first bus lane 1201 . Details of the operation of the devices connected to the bus lanes are not provided here.
- the first bus lane 1201 (256 bits wide, for example) is dedicated to fast path packet data.
- a second bus lane 1202 (128 bits wide, for example) is used for general non-packet data, such as table lookups, instruction fetching, external memory access, etc.
- Blocks accessing bus lanes 1201 , 1202 use AVCI protocol.
- An additional bus lane may be used (not shown here) for reading and writing block configuration and status registers. Blocks accessing this lane use the PVCI protocol.
- Blocks can have multiple bus interfaces, for example. Lane widths can be configured to meet the bandwidth requirements of the system.
- because the interconnection system of the present invention uses point-to-point connections between interfaces, and uses distributed arbitration, it is possible to have several pairs of functional blocks communicating simultaneously without any contention or interference. Traffic between blocks can only interfere if that traffic travels along a shared bus segment in the same direction. This situation can be avoided by choosing a suitable layout. Thus, bus contention can be avoided in the fast path packet flow. This is important to achieve predictable and reliable performance, and to avoid overprovisioning the interconnection.
- This topology uses a T-junction 1305 to exploit the fact that traffic going in opposite directions on the same bus segment 1300 is non-interfering. Using the T-junction block 1305 may ease the design of the bus topology to account for layout and floor planning constraints.
- the interconnection system of the present invention preferably supports advanced virtual component interface transactions, which are simply variable size messages as defined in the virtual component interface standard, sent from an initiator interface to a target interface, possibly followed by a response at a later time. Because the response may be delayed, this is called a split transaction in the virtual component interface system.
- the network processing system architecture defines two higher levels of abstraction in the inter-block communication protocol, the chunk and the abstract datagram (frequently simply called a datagram).
- a chunk is a logical entity that represents a fairly small amount of data to be transferred from one block to another.
- An abstract datagram is a logical entity that represents the natural unit of data for the application.
- Chunks are somewhat analogous to CSIX C frames, and are used for similar purposes, that is, to have a convenient, small unit of data transfer. Chunks have a defined maximum size, typically about 512 bytes, while datagrams can be much larger, typically up to 9K bytes; the exact size limits are configurable.
- the system addressing scheme according to the embodiment of the present invention will now be described in more detail.
- the system according to an embodiment of the present invention may span a subsystem that is implemented in more than one chip.
- the routing of transactions and responses between initiators and targets is performed by the interface blocks that connect to the interconnection itself, and the intervening T-switch elements in the interconnection. Addressing of the blocks according to an embodiment of the present invention is hardwired and geographic, and the routing information is compiled into the interface unit, T-switch and node element logic at chip integration time.
- the interface ID occupies the upper part of the physical 64 bit address, the lower bits being the offset within the block. Additional physical bits are reserved for the chip ID to support multi-chip expanses.
- since the platform according to the embodiment of the present invention requires some modularity at the chip level as well as at the block level on chips, knowledge of what other chips or their innards they are connected to cannot be hard-coded, as this may vary on different line cards. This prevents the use of the same hard-wired bus routing information scheme as exists in the interface units for transactions within one chip.
- An example of a traffic handler subsystem in which the packet queue memory is implemented around two memory hub chips is shown in FIGS. 14 and 15.
- the four chips have four connections to other chips. This results in possible ambiguities about the route that a transaction takes from one chip to the next. Therefore, it is necessary to control the flow of transactions by configuring the hardware and software appropriately, but without having to include programmable routing functions in the interconnection.
- the chip ID for chip 1401 may be (4,2), for chip 1403 (5,1), for chip 1404 (5,3) and for chip 1402 (6,2).
- Simple, hardwired rules are applied about how to route the next hop of a transaction destined for another chip.
- the chips are located on a virtual “grid” such that the local rules produce the transaction flows desired.
- the grid can be “distorted” by leaving gaps or dislocations to achieve the desired effect.
- Each chip has complete knowledge of itself, including how many external ports it has and their assignments to N,S,E,W compass points. This knowledge is wired into the interface units and T-switches. A chip has no knowledge at all of what other chips are connected to it, or their x,y coordinates.
- a transaction is routed out on the interface that forms the least angle with its relative grid location.
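The “least angle” rule can be sketched as follows. This is an illustrative model only: the compass-heading convention (east = +x, north = +y), the function names, and the use of floating-point angles are all assumptions; real hardware would use a simpler fixed comparison.

```python
import math

# Illustration of the "least angle" rule: each external port has a
# compass heading, and a transaction for a chip (dx, dy) away on the
# virtual grid leaves on the port whose heading is closest to the
# direction of the destination.

HEADINGS = {"E": 0.0, "N": 90.0, "W": 180.0, "S": 270.0}  # east = +x, north = +y

def least_angle_port(ports, dx, dy):
    """ports: the compass points this chip actually has, e.g. ["N", "E"]."""
    target = math.degrees(math.atan2(dy, dx)) % 360.0

    def angle_to(port):
        d = abs(HEADINGS[port] - target) % 360.0
        return min(d, 360.0 - d)  # smallest angular separation

    return min(ports, key=angle_to)
```

A chip with only north and east ports, asked to reach a destination to the north-west, would therefore send the transaction north.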
Abstract
An interconnection system (110) interconnects a plurality of reusable functional units (105 a), (105 b), (105 c). The system (110) comprises a plurality of nodes (135), (140), (145), (150), (155), (160) each node communicating with a functional unit. A plurality of data packets are transported between the functional units. Each data packet has routing information associated therewith to enable a node to direct the data packet via the interconnection system.
Description
- The present invention relates to an interconnection network. In particular, but not exclusively, it relates to an intra-chip interconnection network.
- A typical bus system for interconnecting a plurality of functional units (or processing units) consists of either a set of wires with tri-state drivers, or two uni-directional data-paths incorporating multiplexers to get data onto the bus. Access to the bus is controlled by an arbitration unit, which accepts requests to use the bus, and grants one functional unit access to the bus at any one time. The arbiter may be pipelined, and the bus itself may be pipelined in order to achieve a higher clock rate. In order to route data along the bus, the system may comprise a plurality of routers which typically comprise a look up table. The data is then compared with the entries within the routing look up table in order to route the data onto the bus to its correct destination.
- However, such routing schemes cannot be realised on a chip, since the complexity and the size of its components make this infeasible. This has been overcome in existing on-chip bus systems by using a different scheme in which data is broadcast, that is, transferring the data from one functional unit to a plurality of other functional units simultaneously. This avoids the need for routing tables. However, broadcasting data to all functional units on the chip consumes considerable power and is, thus, inefficient. Also, it is becoming increasingly difficult to transfer data over relatively long distances in one clock cycle.
- Furthermore, in a typical bus system, since every request to use the bus (transactor) must connect to the central arbiter, this limits the scalability of the system. As bigger systems are built, the functional units are further from the arbiter, so latency increases and the number of concurrent requests that may be handled by a single arbiter is limited. Therefore, in such central arbiter based bus systems, the length of the bus and the number of transactors are normally fixed at the outset, and it would not be possible to lengthen the bus at a later stage to meet varying system requirements.
- Another form of interconnection is a direct interconnection network. These types of networks typically comprise a plurality of nodes, each of which is a discrete router chip. Each node (router) may connect to a processor and to a plurality of other nodes to form a network topology.
- In the past, it has been infeasible to use this network-based approach as a replacement for on-chip buses because the individual nodes are too big to be implemented on a chip.
- Many existing buses are created to work with a specific protocol. Many of the customised wires relate to specific features of that protocol. Conversely, many protocols are based around a specific bus implementation, for example having specific data fields to aid the arbiter in some way.
- The object of the present invention is to provide an interconnection network as an on-chip bus system. This is achieved by routing data on the bus as opposed to broadcasting data. The routing of the data is achieved by a simple addressing scheme in which each transaction has routing information associated therewith, for example a geographical address, which enables the nodes within the interconnection network to route the transaction to its correct destination.
- In this way, the routing information contains information on the direction to send the data packet. This routing information is not merely an address of a destination but provides directional information, for example x,y coordinates of a grid to give direction. Thus the nodes do not need routing table(s) or global signals to determine the direction, since all the information the node needs is contained in the routing information of the data packets. This enables the circuitry of the node and the interconnection system to be simplified, making integration of the system onto a chip feasible.
- If each functional unit is connected to a node, and all nodes are connected together, then a pipeline connection will exist between each pair of nodes in the system. The number of intervening nodes will govern the number of pipeline stages. If there is a pair of joined nodes where the distance between them is too great to transmit data within a single clock cycle, a repeater block can be inserted between the nodes. This block registers the data, while maintaining the same protocol as the other bus blocks. The inclusion of the repeater blocks allows interconnections of arbitrary length to be created.
- The interconnection system according to the present invention can be utilised in an intra-chip interconnection network. Data transfers are all packetized, and the packets may be of any length that is a multiple of the data-path width. The blocks of the bus used to create the interconnection network (nodes and T-switches) all have registers on the data-path(s).
- The main advantage of the present invention is that it is inherently re-usable. The implementer need only instantiate enough functional blocks to form an interconnection of the correct length, with the right number of interfaces, and with enough repeater blocks to achieve the desired clock rate.
- The interconnection system in accordance with the present invention employs distributed arbitration. The arbitration capability grows as more blocks are added. Therefore, if the bus needs to be lengthened, it is a simple matter of instantiating more nodes and possibly repeaters. Since each module manages its own arbitration within itself, the overall arbitration capability of the interconnect increases. This makes the bus system of the present invention more scalable (in length and overall bandwidth) than other conventional bus systems.
- The arbitration adopted by the system of the present invention is truly distributed and ‘localised’. This has been simplified such that there is no polling to see if the downstream route is free, as in conventional distributed systems; instead, this information is initiated by the ‘blocked’ node and pipelined back up the interconnection (bus) by upstream nodes.
- The interconnection in accordance with the present invention is efficient in terms of power consumption. Since packets are routed, rather than broadcast, only the wires between the source and destination node are toggled. The remaining bus drivers are clock-gated. Hence the system of the present invention consumes less power.
- Furthermore, every node on the bus has a unique address associated with it: an interface address. A field in the packet is reserved to hold a destination interface address. Each node on the bus will interrogate this field of an incoming packet; if it matches its interface address it will route the packet off the interconnection (or bus), and if it does not match it will route the packet down the bus. The addressing scheme could be extended to support “wildcards” for broadcast messages: if a subset of the address matches the interface address then the packet is both routed off the bus and passed on down the bus, otherwise it is just sent on down the bus.
- For packets coming on to the bus, each interface unit interrogates the destination interface address of the packet. This is used to decide which direction a packet arriving on the bus from an attached unit is to be routed. In the case of a linear bus, this could be a simple comparison: if the destination address is greater than the interface address of the source of the data then the packet is routed “up” the bus, otherwise the packet is routed “down” the bus. This could be extended such that each interface unit maps destination addresses, or ranges of addresses, to directions on the bus.
- Preferably, the interface unit sets a binary lane signal based on the result of this comparison. In this way functionality is split between the node and interface unit. All “preparation” of the data to be transported (including protocol requirements) is carried out in the interface unit. This allows greater flexibility as the node is unchanging irrespective of the type of data to be transported, allowing the node to be re-used in different circuits. More preferably the node directs the packet off the interconnection system to a functional unit.
- More preferably, for data destined for the interconnection, the interface unit can carry out the following functions: take the packet from the functional unit; ensure a correct destination module ID and head and tail bits; compare the destination module ID to the local module ID and set a binary lane signal based on the result of this comparison; pack the module ID, data and any high level (non-bus) control signals into a flit; implement any protocol change necessary; and pass the lane signal and flit to the node using the inject protocol.
- A T-junction or switch behaves in a similar way; the decision here is simply whether to route the packet down one branch or the other. This would typically be done for ranges of addresses; if the address is larger than some predefined value then the packets are routed left, otherwise they are routed right. However, more complex routing schemes could be implemented if required.
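The per-node and T-junction decisions described in the preceding paragraphs can be sketched as follows. This is an illustrative model; the function names, the wildcard-mask encoding, and the threshold-based T-junction rule are assumptions consistent with, but not mandated by, the description.

```python
# Sketch of the per-node routing decisions: exact match routes a
# packet off the bus, a wildcard (broadcast) match both consumes and
# forwards it, and injection direction is a simple comparison.

def node_actions(my_addr, dest_addr, wildcard_mask=0):
    """Actions a node takes for an incoming packet; a non-zero wildcard
    mask marks address bits to ignore when matching (broadcast case)."""
    match = (dest_addr & ~wildcard_mask) == (my_addr & ~wildcard_mask)
    if match and wildcard_mask:
        return {"consume", "forward"}  # route off AND pass on down the bus
    if match:
        return {"consume"}
    return {"forward"}

def inject_lane(my_addr, dest_addr):
    """Binary lane select set by the interface unit (linear bus case)."""
    return "up" if dest_addr > my_addr else "down"

def t_junction(dest_addr, threshold):
    """Range-based branch decision at a T-junction."""
    return "left" if dest_addr > threshold else "right"
```

More complex schemes (per-range direction maps) would replace the comparisons with small hardwired tables.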
- The addressing scheme can be extended to support inter-chip communication. In this case a field in the address is used to define a target chip address with, for example, 0 in this field representing a local address of the chip. When a packet arrives at the chip this field will be compared with the pre-programmed address of the chip. If they match then the field is set to zero and the local routing operates as above. If they do not match, then the packet is routed along the bus to the appropriate inter-chip interface in order to be routed towards its final destination. This scheme could be extended to allow a hierarchical addressing scheme to manage routing across systems, sub-systems, boards, groups of chips, as well as individual chips.
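The inter-chip handling just described can be sketched as follows. The position and width of the chip-ID field are assumptions for illustration; only the behaviour (a zero or matching field means local routing, with the field cleared on a match) comes from the description above.

```python
# Sketch of inter-chip address handling: compare the packet's chip-ID
# field with the pre-programmed chip address; on a match, clear the
# field and fall back to local routing, otherwise forward the packet
# towards an inter-chip interface.

CHIP_FIELD_SHIFT = 56  # assumed field position within a 64-bit address
CHIP_FIELD_MASK = 0xFF << CHIP_FIELD_SHIFT

def on_chip_arrival(addr, my_chip_id):
    chip = (addr & CHIP_FIELD_MASK) >> CHIP_FIELD_SHIFT
    if chip == 0 or chip == my_chip_id:
        # 0 already denotes a local address; on a match, zero the field.
        return addr & ~CHIP_FIELD_MASK, "route-local"
    return addr, "route-to-inter-chip-interface"
```

A hierarchical scheme would repeat this test for additional fields (system, sub-system, board, group of chips) before falling through to local routing.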
- The system according to the present invention is not suitable for all bus-type applications. The source and destination transactors are decoupled, since there is no central arbitration point. The advantage of the approach of the present invention is that long buses (networks) can be constructed, with very high aggregate bandwidth.
- The system of the present invention is protocol agnostic. The interconnection of the present invention merely transports data packets. Interface units in accordance with the present invention manage all protocol specific features. This means that it is easy to migrate to a new protocol, since only the interface units need to be re-designed.
- The present invention also provides flexible topology and length.
- The repeater blocks of the present invention allow very high clock rates in that the overall modular structure of the interconnection prevents the clock rate being limited by long wires. This simplifies the synthesis and layout. The repeater blocks not only pipeline the data as it goes downstream but also implement a flow control protocol, pipelining blockage information up the interconnection (or bus) (rather than a blocking signal being distributed globally). When blocked, a feature of this mechanism is that data compression (inherent buffering) is achieved on the bus at least to double the latency figure, i.e. if the latency through the repeater is one cycle then two data flow control digits will concatenate when it is blocked. (A flit, the basic unit of data transfer over the interconnection of the present invention, includes n bytes of data as well as some side-band control signals; the flow control digit size equals the size of the bus data-path.) This means that the scope of any blocking is minimised, thus reducing any queuing requirement in a functional block.
- The flow of data flow control digits is managed by a flow control protocol, in conjunction with double-buffering in a store block (and repeater unit) as described previously.
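A behavioural sketch of a repeater stage with the double-buffering just described. This is illustrative only; the class and parameter names are assumptions, and a two-flit store is assumed to match the "two flow control digits concatenate when blocked" behaviour:

```python
class Repeater:
    """One repeater stage: one cycle of latency when unblocked,
    and an assumed two-flit store so data concatenates in place
    when the downstream stage blocks."""
    def __init__(self):
        self.store = []          # holds at most two flits

    def occup(self):
        return len(self.store)   # 0 = empty, 1 = one flit, 2 = full

    def cycle(self, flit_in, downstream_ready):
        """One clock cycle: emit a flit if downstream can accept,
        then latch any incoming flit."""
        out = None
        if downstream_ready and self.store:
            out = self.store.pop(0)
        if flit_in is not None:
            assert len(self.store) < 2, "sender violated flow control"
            self.store.append(flit_in)
        return out

r = Repeater()
r.cycle('A', downstream_ready=False)   # 'A' enters; downstream blocked
r.cycle('B', downstream_ready=False)   # 'B' concatenates behind 'A'
```

After these two cycles the stage holds two flits, illustrating the inherent buffering; once `downstream_ready` is raised it drains them in order.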
- The components of the interconnection of the present invention manage the transportation of packets. Customised interface units handle protocol specific features. These typically involve packing and unpacking of control information and data, and address translation. A customised interface unit can be created to match any specific concurrency.
- Many packets can be travelling along separate segments of the interconnection of the present invention simultaneously. This allows the achievable bandwidth to be much higher than the raw bandwidth of the wires (width of bus, multiplied by clock rate). If there are, for example, four adjacent on-chip blocks A, B, C and D, then A and B can communicate at the same time that C and D communicate. In this case the achievable bandwidth is twice that of a broadcast-based bus.
- Packets are injected (gated) onto the interconnection at each node, so that each node is allocated a certain amount of the overall bandwidth allocation (e.g. by being able to send, say, 10 flow control digits within every 100 cycles). This distributed scheme controls the overall bandwidth allocation.
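The gating scheme can be sketched as a simple quota controller. The class name, window mechanism and parameters are illustrative assumptions:

```python
class InjectController:
    """Per-node bandwidth gating: at most `quota` flits may be
    injected in any window of `window` cycles (e.g. 10 per 100)."""
    def __init__(self, quota=10, window=100):
        self.quota, self.window = quota, window
        self.cycle_no, self.sent = 0, 0

    def try_inject(self):
        if self.cycle_no % self.window == 0:
            self.sent = 0                  # new window: refresh the quota
        self.cycle_no += 1
        if self.sent < self.quota:
            self.sent += 1
            return True                    # flit may be gated onto the bus
        return False                       # hold the flit for the next window

ic = InjectController(quota=2, window=4)
grants = [ic.try_inject() for _ in range(8)]
# grants: two injections allowed at the start of each 4-cycle window
```

In hardware this would be a small counter per node; programmability of `quota` and `window` gives the bandwidth-allocation control mentioned later in the description.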
- It is possible to keep forcing packets onto the interconnection of the present invention until it saturates. All packets will eventually be delivered. This means the interconnection can be used as a buffer with an in-built flow control mechanism.
- FIG. 1 is a block schematic diagram of the system incorporating the interconnection system according to an embodiment of the present invention;
- FIG. 2 is a block schematic diagram illustrating the initiator and target of a virtual component interface system of FIG. 1;
- FIG. 3 is a block schematic diagram of the node of the interconnection system shown in FIG. 1;
- FIG. 4 is a block schematic diagram of connection over the interconnection according to the present invention between virtual components of the system shown in FIG. 1;
- FIG. 5a is a diagram of the typical structure of the T-switch of FIG. 1;
- FIG. 5b is a diagram showing the internal connection of the T-switch of FIG. 5a;
- FIG. 6 illustrates the Module ID (interface ID) encoding of the system of FIG. 1;
- FIG. 7 illustrates handshaking signals in the interconnection system according to an embodiment of the present invention;
- FIG. 8 illustrates the blocking behaviour of the interconnection system of an embodiment of the present invention when occup[1:0]=01;
- FIG. 9 illustrates blocking for two cycles of the interconnection system according to an embodiment of the present invention;
- FIG. 10 illustrates a virtual component interface handshake according to an embodiment of the present invention;
- FIG. 11 illustrates a linear chip arrangement of the system according to an embodiment of the present invention;
- FIG. 12 is a schematic block diagram of the interconnection system of the present invention illustrating an alternative topology;
- FIG. 13 is a schematic block diagram of the interconnection system of the present invention illustrating a further alternative topology;
- FIG. 14 illustrates an example of a traffic handling subsystem according to an embodiment of the present invention;
- FIG. 15 illustrates a system for locating chips on a virtual grid according to a method of a preferred embodiment of the present invention; and
- FIG. 16 illustrates routing a transaction according to the method of a preferred embodiment of the present invention.
- The basic mechanism for communicating data and control information between functional blocks is that blocks exchange messages using the interconnection system 110 according to the present invention. The bus system can be extended to connect blocks in a multi-chip system, and the same mechanism works for blocks within a chip or blocks on different chips. - An example of a
system 100 incorporating the interconnection system 110 according to an embodiment of the present invention, as shown in FIG. 1, comprises a plurality of reusable on-chip functional blocks or virtual component blocks 105 a, 105 b and 105 c. These functional units interface to the interconnection and can be fixed. They can be re-used at various levels of abstraction (e.g. RTL, gate level, GDSII layout data) in different circuit designs. The topology can be fixed once the size, aspect ratio and the location of the I/Os to the interconnection are known. Each on-chip functional unit is connected to the interconnection system 110 via its interface unit. The interface unit handles address decoding and protocol translation. The on-chip functional block 105 a, for example, is connected to the interconnection system 110 via an associated virtual component interface initiator 115 a and peripheral virtual component interface initiator 120 a. - The on-chip
functional block 105 b, for example, is connected to the interconnection system 110 via an associated virtual component interface target 125 b and peripheral virtual component interface target 130 b. The on-chip functional block 105 c, for example, is connected to the interconnection system 110 via an associated virtual component interface initiator 115 c and peripheral virtual component interface target 130 c. The associated initiators and targets for each on-chip functional block shown in FIG. 1 are purely illustrative and may vary depending on the associated block requirements. A functional block may have a number of connections to the interconnection system. Each connection has an advanced virtual component interface (with extensions forming a superset of the basic virtual component interface; this is the protocol used for the main data interfaces in the system of the present invention) or a peripheral virtual component interface (a low bandwidth interface allowing atomic operations, mainly used in the present invention for control register access). - One currently accepted protocol for connecting such on-chip functional units as shown in FIG. 1 to a system interconnection according to the embodiment of the present invention is the virtual component interface. The virtual component interface is an OCB standard interface for communication between a bus and/or virtual component, which is independent of any specific bus or virtual component protocol.
- There are three types of virtual component interface: peripheral, basic and advanced.
- The advanced virtual component interface unit comprises a target and an initiator. The target and initiator are virtual components that send request packets and receive response packets. The initiator is the agent that initiates transactions, for example, DMA (or EPU on F150).
- As shown in FIG. 2, an interface unit that initiates a read or write transaction is called an initiator 210 (issues a request 220), while an interface that receives the transaction is called the target 230 (responds to a request 240). This is the standard virtual component terminology.
- Communication between each on-chip functional block and the interconnection takes place via its initiator or target interface units, each of which connects to a unique node on the bus. - The
interconnection system 110 according to an embodiment of the present invention comprises three separate buses: a peripheral virtual component interface bus, a 128-bit advanced virtual component interface bus 165, and a 256-bit advanced virtual component interface bus 175. Although three separate buses are illustrated here, it is appreciated that the interconnection system of the present invention may incorporate any number of separate buses. - At regular intervals along the bus length a
repeater unit 180 may be inserted for all the buses. Repeater units would of course only be required when the timing constraints between two nodes cannot be met due to the length of wire of the interconnection.
- Data is transferred on the interconnection network of the present invention in packets. The packets may be of any length that is a multiple of the data-path width. The
nodes - Each interface unit is connected to a node within the interconnection system itself, and therefore to one particular lane of the bus. Connections may be of initiator or target type, but not both—following from the conventions of virtual component interface. In practise every block is likely to have a peripheral virtual component interface target interface for configuration and control.
- The bus components according to the embodiment of the present invention use distributed arbitration, where each block in the bus system manages access to its own resources.
- A
node 135 according to the embodiment of the present invention is illustrated in FIG. 3. Eachnode Node 135 is connected to thebus 175 of FIG. 1. Each node comprises a first andsecond input store first input store 315 has an input connected to afirst bus lane 305. Thesecond input store 320 has an input connected to asecond bus lane 310. The output of thefirst input store 315 is connected to athird bus lane 306 and the output of thesecond input store 320 is connected to afourth bus lane 311. Each node further comprises an injectcontrol unit 335 and a consumecontrol unit 325. The node may not require the consume arbitration, for example the node may have an output for each uni-directional lane but the consume handshaking retained. The input of the injectcontrol unit 335 is connected to the output of an interface unit of the respective functional unit for that node. The outputs of the injectcontrol unit 335 are connected to afifth bus lane 307 arid sixth bus lane 312. The input of the consumecontrol unit 325 is connected to the output of amultiplexer 321. The inputs of themultiplexer 321 are connected to thefourth bus lane 311 and thethird bus lane 306. The output of the consumecontrol unit 325 is connected to abus 330 which is connected to the interface unit of the respective functional unit for that node. Thefifth bus lane 307 and thethird bus lane 306 are connected to the inputs of amultiplexer 308. The output of themultiplexer 308 is connected to thefirst bus lane 305. Thefourth bus lane 311 and the sixth bus lane 312 are connected to the inputs of amultiplexer 313. The output of themultiplexer 313 is connected to thesecond bus lane 310. - The nodes are the connection points where data leaves or enters the bus. It also forms part of the transport medium. The node forms part of the bus lane which it connects to, including both directions of data path. The node conveys data on the lane to which i connects, with one cycle of latency when not blocked. 
It also allows the connecting functional block to inject and consume data in either direction, via its interface unit. Arbitration of injected or passing data is performed entirely within the node. Internally,
bus 175 consists of a first lane 305 and a second lane 310. The first and second lanes pass through the node 135. As illustrated in FIG. 3, the direction of data flow of the first lane 305 is in the opposite direction to that of the second lane 310. Each lane is uni-directional. - The node 135 is capable of concurrently receiving and injecting data on the same bus lane. At the same time it is possible to pass data through on the other lane. Each uni-directional lane 305, 310 carries a separate stream of data. The streams merge at point 321 where data leaves the node 135 into the on-chip module 105 a (not shown here) via the interface unit. Data from the on-chip block 105 a is injected onto the bus lanes of the interconnection 110.
block 105 a where those resources reside. In this case, it is competition for access to bus lanes, and for access to the single lane coming off the bus. This approach of using local arbitration is used throughout the interconnection system, and is key to its scalability. An alternative would be that both output buses come from the node to the functional unit and then the arbitration mux would not be needed. - Each lane can independently block or pass data through. Data can be consumed from one lane at a time, and injected on one lane at the same time. Concurrent inject and consume on the same lane is also permitted. Which lane each packet is injected on is determined within the interface unit.
- Each input store (or register)315 and 320 registers the data as it passes from node to node. Each
store - The
output interface unit bus lanes lane 330 that feeds into the on-chipfunctional unit 105 a via the interface unit which is connected to thenode 135. Theoutput interface unit input interface unit 335. Theinput interface unit 335 performs de-multiplexing of packets onto one of thebus lanes - A plurality of
repeater units 180 are provided at intervals along the length of theinterconnection 110. Eachrepeater unit 180 is used to introduce extra registers on the data path. It adds an extra cycle of latency, but is only used where there is a difficulty meeting timing constraints. Eachrepeater unit 180 comprises a store similar to the store unit of the nodes. The store unit merely passes data onwards, and implements blocking behaviour. There is no switching carried out in the repeater unit. The repeater block allows for more freedom in chip layout. For example, it allows long length of wires between nodes or where a block has a single node to connect to a single lane, repeaters may be inserted into the other lanes in order to produce uniform timing characteristics over all lanes. There may be more than one repeater between two nodes. - The system according to the embodiment of the present invention is protocol agnostic, that is to say, the data-transport blocks such as the
nodes repeater units 180 and T-switch 185 simply route data packets from a source interface to a destination interface. Each packet will contain control information and data. The packing and unpacking of this information is performed in theinterface units - A large on-chip block may have several interfaces to the same bus.
- The target and
initiator - The interface unit target comprises a send path which receives incoming read and write requests. The target converts from bus communication protocol to advanced virtual component interface protocol. It blocks data on the bus if it cannot be accepted across the virtual component interface. The target also comprises a response path which carries read (and for verification purposes, write) requests. It converts advanced virtual component interface communication protocol to bus protocol and blocks data at the advanced virtual component interface if it cannot be sent onto the bus.
- The other type of interface unit utilised in the embodiment of the present invention is a peripheral virtual component unit. The main differences between the peripheral virtual component interface and the advanced virtual component interface are the data interface of the peripheral virtual component interface is potentially narrower (up to 4 bytes) than the advanced virtual component interface and the peripheral virtual component interface is not split transaction.
- The peripheral virtual component interface units perform conversion between the peripheral virtual component interface and bus protocols in the initiator, and from the bus protocol to peripheral virtual component interface protocol in the target. The protocol is an asynchronous handshake on the peripheral virtual component interface side.
- The interface unit initiator comprises a send path. It generates destination module ID and the transport lane number from memory address. The initiator blocks the data at the peripheral virtual component interface when it cannot be sent onto the bus. The initiator also comprises a response path. This path receives previously requested data, converting from bus communication protocol to the peripheral virtual component interface protocol. It also blocks data on the bus if the on-chip block (virtual component block) is unable to receive it.
- The peripheral virtual component interface unit target comprises a send path which receives incoming read and write requests. It blocks data on the bus if it cannot be accepted across the virtual component interface. The target also comprises a response path which carries read (and for verification purposes, write) requests. It converts peripheral virtual component interface communication protocol to bus protocol and blocks data at the virtual component interface if it cannot be sent onto the bus.
- The peripheral virtual component interface initiator may comprise a combined initiator and target. This is so that the debug registers (for example) of an initiator can be read from.
- With reference to FIG. 4, the virtual component (on-chip) blocks can be connected to each other over the interconnection system according to the present invention. A first virtual component (on-chip) block425 is connected point to point to an
interface unit target 430. Theinterface unit target 430 presents a virtualcomponent initiator interface 440 to thevirtual component target 445 of on-chip block 425. Theinterface unit target 430 uses a busprotocol conversion unit 448 to interface to thebus interconnect 450. Theinterface unit initiator 460 presents atarget interface 470 to theinitiator 457 of the second on-chip block 455 and, again, uses a busprotocol conversion unit 468 on the other side. - The T-
switch 185 of FIG. 1 is a block that joins 3 nodes, allowing more complex interconnects than simple linear ones. At each input port the interface ID of each packet is decoded and translated into a single bit, which represents the two possible outgoing ports. A hardwired table inside the T-Switch performs this decoding. There is one such table for each input port on the T-Switch. Arbitration takes place for the output ports if there is a conflict. The winner may send the current packet, but must yield when the packet has been sent. FIGS. 5a and 5 b show an example of the structure of a T-switch. - The T-switch comprises three sets of input/
output ports unidirectional bus lanes junction bus lanes incoming bus 520 coming into aninput port 515 can be output viaoutput port - Packets do not change lanes at any point on the bus, so the T-switch can be viewed as a set of n 3-way switches, where n is the number of uni-directional bus lanes. The T-
switch 185 comprises a lane selection unit. The lane selection unit takes in module ID of incoming packets and produces a 1-bit result corresponding to the two possible output ports on the switch. The T-switch also comprises a store block on each input lane. Each store block stores data flow control digits and allows them to block in place if the output port is temporarily unable to receive. It also performs a block on header function, which allows switching to occur at the packet level (rather than the flow control digit level). The T-switch also includes an arbiter for arbitration between requests to use output ports. - During initialisation, the interconnection system according to the embodiment of the present invention powers up into a usable state. Routing information is hardcoded into the bus components. A destination module interface ID (mod ID) for example as illustrated in FIG. 6 is all that is required to route a packet to another node. In order for that node to return a response packet, it must have been sent the module interface ID of the sender.
- There may be more than one interconnection in a processing system. On each bus, every interface (which includes an inject and consume port) has a unique ID. These ID's are hard-coded at silicon compile-time.
- Units attached to the bus (on-chip blocks) are free to start communicating straight after reset. The interface unit will hold off communications (by not acknowledging them) until it is ready to begin operation.
- The interconnection system according to the present invention has an internal protocol that is used throughout. At the interfaces to the on-chip blocks this may be converted to some other protocol, for example virtual component interface as described above. The internal protocol will be referred to as the bus protocol This bus protocol allows single cycle latency for packets travelling along the bus when there is no contention for resources, and to allow packets to block in place when contention occurs.
- The bus protocol is used for all internal (non interface/virtual component interface) data transfers. It consists of five signals:
occup 705,head 710,tail 715,data 720 and valid 725 between a sender 735 and areceiver 730. These are shown in FIG. 7. - The packets consist of one or more flow control digits. On each cycle that the sender asserts the valid signal, the receiver must accept the data on the next positive clock edge.
- The
receiver 730 informs the sender 735 about its current status using theoccup signal 705. This is a two-bit wide signal.TABLE I Occup signal values and their meaning. Occup [1:0] Meaning 00 Receiver is empty - can send data. 01 Receiver has one flow control digit - if sending a flow control digit on this cycle, don't send a flow control digit on the next cycle. 10 The Receiver is full. Don't send any flow control digits until Occup decreases. 11 Unused. - The
occup signal 705 tells the sender 735 if and when it is able to send data. When the sender 735 is allowed to transmit a data flow control digit, it is qualified with avalid signal 725. - The first flow control digit in each packet is marked by head=‘1’. The last flow control digit is marked by tail=‘1’. A single flow control digit packet has signals head=tail=valid=‘1’. Each node and T-Switch use these signals to perform switching at the packet level.
- FIG. 8 shows an example of blocking behaviour on the interconnect system according to an embodiment of the present invention. The occup signal is set to 01 (binary), meaning ‘if sending a flow control digit this cycle, don't send one on the next cycle’.
- FIG. 9 shows an example of the blocking mechanism more completely. The occup signal is set to 01 (binary), then to 10 (binary). The sender can resume transmitting flow control digits when the occup signal is set back to 01—at that point it is not currently sending a flow control digit, so it is able to send one on the next cycle.
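The sender-side rule implied by Table I and the two blocking examples above can be sketched as a single-cycle decision function. This is a simplified view; the names and the `sent_last_cycle` formulation are illustrative:

```python
OCCUP_EMPTY, OCCUP_ONE, OCCUP_FULL = 0b00, 0b01, 0b10

def may_send(occup, sent_last_cycle):
    """Sender rule: free to send when the receiver is empty; when
    occup == 01 a flit may be sent only if none was sent on the
    previous cycle; when the receiver is full, never send."""
    if occup == OCCUP_EMPTY:
        return True
    if occup == OCCUP_ONE:
        return not sent_last_cycle
    return False                 # OCCUP_FULL (the value 11 is unused)
```

This captures the resume condition of FIG. 9: when occup drops back to 01 the sender is not currently sending, so it may send on the next cycle.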
- The protocol at the boundary between the node and the interface unit is different from that just described, and is similar to that used by the virtual component interface. At the sending and receiving interfaces, there is a val and an ack signal. When val=ack=1, a flow control digit is exchanged for the inject protocol. The consume (bus output) protocol is different to the inject protocol but uses the minimum logic that allows registered outputs (and thus simplifies synthesis and integration into a system on chip). The consume protocol is defined as follows: on the rising clock edge, data is blocked on the next clock edge if CON_BLOCK=1; on the rising clock edge, data is unblocked on the next clock edge if CON_BLOCK=0. CON_BLOCK is the flow control (blocking) signal from the functional unit.
- Of course the protocols at this interface can be varied without affecting the overall operation of the bus.
- The difference between this and virtual component interface is that the ack signal is high by default, and is only asserted low on a cycle when data cannot be received. Without this restriction, the node would need additional banks of registers.
- The bus protocol allows the exchange of packets consisting of one or more flow control digits. Eight bits in the upper part of the first flow control digit of each packet carry the destination module ID, and are used by the bus system to deliver the packet. The top 2 bits are also used for internal bus purposes. In all other bit fields, the packing of the flow control digits is independent of the bus system.
- At each interface unit, virtual component interface protocol is used. The interface control and data fields are packed into bus flow control digits by the sending interface and then unpacked at the receiving interface unit. The main, high-bandwidth, interface to the bus uses the advanced virtual component interface. All features of the advanced virtual component interface are implemented, with the exception of those used to optimise the internal operation of an OCB.
- The virtual component interface protocol uses an asynchronous handshake as shown in FIG. 10. Data is valid when VAL=ACK=1. The bus interface converts data and control information from the virtual component interface protocol to the bus internal communication protocol.
- The bus system does not distinguish between control information and data. Instead, the control bits and data are packed up into packets and sent to the destination interface unit, where they are unpacked and separated back into data and control.
- Although in the preferred embodiment, virtual component interface compliant interface units are utilised, it is appreciated that different interface units may be used instead (e.g. ARM AMBA compliant interfaces).
- Table II shows the fields within the data flow control digits that are used by the interconnection system according to an embodiment of the present invention. All other information in the flow control digits is simply transported by the bus. The encoding and decoding is performed by the interface units. The interface units also insert the head and tail bits into the flow control digits, and insert the MOD ID in the correct bit fields.
TABLE II: Specific fields.
  Name    Bit                                                            Comments
  Head    FLOW CONTROL DIGIT_WIDTH - 1                                   Set to ‘1’ to indicate first flow control digit of packet.
  Tail    FLOW CONTROL DIGIT_WIDTH - 2                                   Set to ‘1’ to indicate last flow control digit of packet.
  Mod ID  FLOW CONTROL DIGIT_WIDTH - 3 : FLOW CONTROL DIGIT_WIDTH - 10   ID of interface to which packet is to be sent. Virtual component interface calls this MOD ID. It is really an interface ID, since a large functional unit could have multiple bus interfaces, in which case it is necessary to distinguish between them.
- The advanced virtual component interface packet types are read request, write request and read response. A read request is a single flow control digit packet and all of the relevant virtual component interface control fields are packed into the flow control digit. A write request consists of two or more flow control digits. The first flow control digit contains virtual component interface control information (e.g. address). The subsequent flow control digits contain data and byte enables. A read response consists of one or more flow control digits. The first and subsequent flow control digits all contain data plus virtual component interface response fields (e.g. RSCRID, RTRDID and RPKTID).
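The head, tail and module ID bit positions of Table II can be sketched as a small packing helper. The flit width is a parameter and the function names are illustrative; this is not the actual RTL:

```python
def pack_fields(flit_width, payload, mod_id, head, tail):
    """Place head at bit flit_width-1, tail at flit_width-2 and the
    8-bit module ID at bits flit_width-3 .. flit_width-10, above an
    (assumed non-overlapping) protocol payload."""
    flit = payload
    flit |= (1 if head else 0) << (flit_width - 1)
    flit |= (1 if tail else 0) << (flit_width - 2)
    flit |= (mod_id & 0xFF) << (flit_width - 10)
    return flit

def unpack_mod_id(flit, flit_width):
    """Node-side extraction of the destination module ID."""
    return (flit >> (flit_width - 10)) & 0xFF
```

A node only ever needs `unpack_mod_id` on the first (head) flit of a packet; the remaining payload bits pass through untouched, matching the protocol-agnostic transport described above.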
- An example mapping of the advanced virtual component interface onto packets is now described. The example is for a bus with 128-bit wide data paths. It should be noted that the nodes extract the destination module ID from bits 159:152 in the first flow control digit of each packet. In the case of read response packets this corresponds with the virtual component interface RSCRID field.
TABLE III: Possible virtual component interface fields for a 128-bit wide bus.
  Signal Name  WIDTH  Flow control digit bits  Header flow control digit only?  Read responses only?  Direction  Comments
  CLOCK        1      -        -  -  IA
  RESETN       1      -        -  -  IA
  CMDACK       1      -        -  -  TI  Handshake signal.
  CMDVAL       1      -        -  -  IT  Handshake signal.
  WDATA        128    127:0    -  -  IT  Only for write requests.
  BE           16     143:128  -  -  IT  Only for write requests.
  ADDRESS      64     63:0     Y  -  IT
  CFIXED       1      64       Y  -  IT
  CLEN         8      72:65    Y  -  IT  ***needs update***
  CMD          2      75:74    Y  -  IT
  CONTIG       1      76       Y  -  IT
  EOP          1      -        -  -  IT  Handshake signal.
  CONST        1      77       Y  -  IT
  PLEN         9      86:78    Y  -  IT
  WRAP         1      87       Y  -  IT
  RSPACK       1      -        -  -  IT  Handshake signal.
  RSPVAL       1      -        -  -  TI  Handshake signal.
  RDATA        128    127:0    -  Y  TI  Only for read responses.
  REOP         1      -        -  -  TI  Handshake signal.
  RERROR       2      143:142  -  Y  TI  Only for read responses.
  DEFD         1      88       Y  -  IT
  WRPLEN       5      93:89    Y  -  IT
  RFLAG        4      141:138  -  Y  TI  Only for read responses.
  SCRID        8      151:144  Y  -  IT
  TRDID        2      95:94    Y  -  IT
  PKTID        8      103:96   Y  -  IT
  RSCRID       8      159:152  -  Y  TI  Only for read responses.
  RTRDID       2      137:136  -  Y  TI  Only for read responses.
  RPKTID       8      135:128  -  Y  TI  Only for read responses.
- Peripheral virtual component interface burst-mode read and write transactions are not supported over the bus, as these cannot be efficiently implemented. For this reason, the peripheral virtual component interface EOP signal should be fixed at logic ‘1’. Any additional processing unit or external units can be attached to the bus, but the EOP signal should again be fixed at logic ‘1’. With this change, the new unit should work normally.
- The read request type is a single flow control digit packet carrying the 32-bit address of the data to be read. The read response is a single flow control digit response containing the requested 32 bits of data. The write request is a single flow control digit packet containing the 32-bit address of the location to be written, plus the 32 bits of data, and 4 bits of byte enable. There is no write response packet, which prevents a target responding to a write request in the same way that it would to a read request.
- With all of the additional signals, 32 bit (data) peripheral virtual component interface occupies 69 bits on the bus.
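The 69-bit packing stated above can be sketched using the bit fields from Table IV (WDATA at 31:0, ADDRESS at 63:32, BE at 67:64, RERROR at 68). This is an illustrative sketch only, with assumed function names:

```python
def pack_pvci_write(address, wdata, be):
    """Pack a peripheral VCI write request into a single flit."""
    flit = wdata & 0xFFFFFFFF            # WDATA in bits 31:0
    flit |= (address & 0xFFFFFFFF) << 32 # ADDRESS in bits 63:32
    flit |= (be & 0xF) << 64             # byte enables in bits 67:64
    return flit

def unpack_pvci_write(flit):
    """Recover (address, wdata, byte enables) at the target."""
    return ((flit >> 32) & 0xFFFFFFFF,
            flit & 0xFFFFFFFF,
            (flit >> 64) & 0xF)
```

A read request would carry only the ADDRESS field, and a read response only RDATA (31:0) plus RERROR (bit 68), consistent with the table.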
TABLE IV: Peripheral virtual component interface fields.
  PVCI Signal Name  Bit Fields  Read Request  Read Response  Write Request  Comments
  CLOCK             -           -             -              -              System signal.
  RESETN            -           -             -              -              System signal.
  VAL               -           -             -              -              Handshake signal.
  ACK               -           -             -              -              Handshake signal.
  EOP               -           -             -              -              Handshake signal.
  ADDRESS           63:32       Y             -              Y
  RD                100         Y             -              Y
  BE                67:64       -             -              Y
  WDATA             31:0        -             -              Y
  RDATA             31:0        -             Y              -
  RERROR            68          -             Y              -
- The internal addressing mechanism of the bus is based on the assumption that all on-chip blocks in the system have a fixed 8-bit module ID.
- Virtual component interface specifies the use of a single global address space. Internally the bus delivers packets based on the module ID of each block in the system. The module ID is 8 bits wide. All global addresses will contain the 8 bits module ID, and the interface unit will simply extract the destination module ID from the address. The location of the module ID bits within the address is predetermined.
- The module IDs in each system are divided into groups. Each group may contain up to 16 modules. The T-switches in the system use the group ID to determine which output port to send each packet to.
- Within each group, there may be up to sixteen on-chip blocks, each with a unique subID. The inclusion of only sixteen modules within each group does not restrict the bus topology. Within each linear bus section, there may be more than one group, but modules from different groups may not interleave. There may be more than sixteen modules between T-switches. The only purpose of using a group ID and subID is to simplify the routing tables inside the T-switch(es). If there are no T-switches being used, the numbering of modules can be arbitrary. If a linear bus topology is used and the interfaces are numbered sequentially, this may simplify lane number generation, as a comparator can be used instead of a table. However, a table may still turn out to be smaller after logic minimisation. Two interfaces on different buses can have the same mod ID (=interface ID).
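The 8-bit module ID and the sixteen-modules-per-group rule suggest a group/subID split that can be sketched as follows. The exact bit split is an assumption, and the lane names in the comparator alternative mentioned above are illustrative:

```python
def split_mod_id(mod_id):
    """Assumed split of the 8-bit module ID: upper 4 bits = group ID,
    lower 4 bits = subID (up to sixteen modules per group)."""
    return (mod_id >> 4) & 0xF, mod_id & 0xF

def lane_by_comparator(mod_id, my_id):
    """For a linear bus with sequentially numbered interfaces, a
    comparator suffices instead of a routing table: route towards
    higher-numbered interfaces on one lane, lower on the other."""
    return 'lane_up' if mod_id > my_id else 'lane_down'
```

The comparator form illustrates why sequential numbering simplifies lane-number generation, though as noted above a minimised table may still be smaller in practice.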
- An example of reducing erroneous traffic on the interconnection according to the embodiment of the present invention is described here. When packets that do not have a legal mod ID are presented to the interface unit, it will acknowledge them, but will also generate an error in the rerror virtual component interface field. The packet will not be sent onto the bus. The interface unit will “absorb” it and destroy it.
- In the preferred embodiment the bus system blocks operate at the same clock rate as synthesisable RTL blocks in any given process. For a 0.13 μm 40 G processor, the target clock rate is 400 MHz. There will be three separate buses, each comprising a separate parameterised component. There will be a 64-bit wide peripheral virtual component interface bus connecting to all functional units on the chip. There will be two advanced virtual component interface buses, one with a 128-bit data-path (raw bandwidth 51.2 Gbits/sec on each unidirectional lane), the other with a 256-bit data-path (raw bandwidth 102.4 Gbits/sec on each unidirectional lane). Not all of this bandwidth can be fully utilised, due to the overhead of control and request packets, and because it is not always possible to achieve efficient packing of data into flow control digits. Some increase in effective bandwidth will be seen due to concurrent data transfers on the bus, but this can only be determined in system simulations.
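The raw bandwidth figures quoted above follow directly from the data-path width and the 400 MHz target clock rate, per unidirectional lane:

```python
# Worked check of the quoted raw bandwidth figures: data-path width in bits
# multiplied by the 400 MHz target clock, per unidirectional lane.

CLOCK_HZ = 400e6  # target clock rate for the 0.13 um process

def raw_bandwidth_gbits(width_bits: int, clock_hz: float = CLOCK_HZ) -> float:
    """Raw lane bandwidth in Gbits/sec."""
    return width_bits * clock_hz / 1e9

assert abs(raw_bandwidth_gbits(128) - 51.2) < 1e-9    # 128-bit AVCI bus
assert abs(raw_bandwidth_gbits(256) - 102.4) < 1e-9   # 256-bit AVCI bus
```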
- In the embodiment, latency of packets on the bus is one cycle at the sending interface unit, one cycle per bus block (nodes and repeaters) that data passes through, one or two additional cycle(s) at the node consume unit, one cycle at receiving interface unit and n−1 cycles, where n is the packet length in flow control digits. This figure gives the latency for the entire data transfer, meaning that latency increases with packet size.
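The cycle count described above can be expressed as a small formula. The function below simply adds the components the text lists; the parameter names are illustrative.

```python
# Worked sketch of the latency figure described above, in clock cycles.
# 'bus_blocks' counts the nodes and repeaters the data passes through, and
# 'consume_cycles' is 1 or 2, per the node consume unit.

def packet_latency(bus_blocks: int, packet_flits: int, consume_cycles: int = 1) -> int:
    assert consume_cycles in (1, 2)
    return (1                      # sending interface unit
            + bus_blocks           # one cycle per node/repeater traversed
            + consume_cycles       # node consume unit
            + 1                    # receiving interface unit
            + (packet_flits - 1))  # remaining flow control digits

# A 4-flit packet crossing 3 bus blocks: 1 + 3 + 1 + 1 + 3 = 9 cycles.
assert packet_latency(bus_blocks=3, packet_flits=4) == 9
```

The `(packet_flits - 1)` term is what makes the end-to-end latency grow with packet size.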
- It is possible to control bandwidth allocation, by introducing programmability into the injection controller in the node.
- It is not possible for packets to be switched between bus lanes. Once a packet has entered the bus system, it stays on the same bus lane until it is removed at the destination interface unit. Packets on the bus are never allowed to interleave. It is not necessary to inject new flow control digits on every clock cycle; in other words, it is possible to have gap cycles. These gaps will remain in the packet while it is inside the bus system and unblocked, thus wasting bandwidth. If blocked, only validated data will be concatenated, so any intermediate non-valid data will be removed. To help minimise the number of gap cycles, it is necessary either to ensure that enough FIFO buffering is provided to allow the block to keep injecting flow control digits until each packet has been completely sent, or to design the block in a manner that does not cause gaps to occur.
- In the system according to the present invention, the length of the packets (in flow control digits) is unlimited. However, care should be taken with excessively large packets, as they will utilise a greater amount of the bus resource. Long packets can be more efficient than a number of shorter ones, due to the overhead of having a header flow control digit.
- The interconnection system according to the present invention does not guarantee that requested data items will be returned to a module in the order in which they were requested; it cannot be assumed that data will arrive at an on-chip block in the same order that it was requested from other blocks in the system. Where ordering is important, the receiving on-chip block must be able to re-order the packets. This is achieved using the advanced virtual component interface pktid field, which is used to tag and reorder outstanding transactions. Failure to adhere to this rule is likely to result in system deadlock.
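A minimal sketch of the receiver-side re-ordering is shown below. Responses arrive tagged with a pktid and are released in tag order; the buffer structure is an assumption, only the use of pktid for tagging and reordering comes from the text.

```python
# Sketch of re-ordering responses by their AVCI pktid tag: responses may
# arrive out of order, so they are held until all lower pktids have been
# delivered. The dict-based buffer is illustrative.

def reorder(responses):
    """Yield (pktid, data) pairs in pktid order as they become available."""
    pending = {}
    expected = 0
    for pktid, data in responses:
        pending[pktid] = data
        # Release every response whose predecessors have all arrived.
        while expected in pending:
            yield expected, pending.pop(expected)
            expected += 1

arrivals = [(1, "b"), (0, "a"), (3, "d"), (2, "c")]
assert list(reorder(arrivals)) == [(0, "a"), (1, "b"), (2, "c"), (3, "d")]
```

Note that, per the text, such a buffer is only needed across targets or lanes: two transactions from the same target on the same lane already arrive in order.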
- The interconnection system according to the present invention offers considerable flexibility in the choice of interconnect topology. However, it is currently not advisable to have loops in the topology as these will introduce the possibility of deadlock.
- However, it should be possible to program routing tables in a deadlock-free manner if loops were to be used. This would require some method of proving deadlock freedom, together with software to implement the necessary checks.
- A further advantage of the interconnection system according to the present invention is that saturating the bus with packets will not cause it to fail. The packets will be delivered eventually, but the average latency will increase significantly. If the congestion is reduced to below the maximum throughput, then it will return to “normal operation”.
- In this respect the following rules should be considered: there should be no loops in the bus topology; on-chip blocks must not depend on transactions being returned in order; and where latency is important and multiple transactors need to use the same bus segments, there should be a maximum packet size. As mentioned above, if loops are required in the future, some deadlock prevention strategy must exist, ideally including a formal proof. Further, if ordering is important, the blocks must be able to re-order the transactions. Two transactions from the same target, travelling on the same lane, will be returned in the order in which the target sent them. If requests were made to two different targets, the ordering is non-deterministic.
- Most of the interconnection components of the present invention will involve straightforward RTL synthesis followed by place & route. The interface units may be incorporated into either the on-chip blocks or an area reserved for the bus, depending on the requirements of the overall design. For example, there may be a lot of free area under the bus, in which case using this area rather than adding to the functional block area would make more sense, as it would reduce the overall chip area.
- The interconnection system forms the basis of a system-on-chip platform. In order to accelerate the process of putting a system-on-chip together, it has been proposed that the nodes contain the necessary "hooks" to handle distribution of the system clock and reset signals. Looking first at an individual chip, the routing of transactions and responses between initiator and target is performed by the interface blocks that connect to the interconnection system, and any intervening T-switch elements in the system. Addressing of blocks is hardwired and geographic, and the routing information is compiled into the interface and T-switch logic at chip integration time. The platform requires some modularity at the chip level as well as the block level on chips. Therefore, knowledge of what other chips, or their innards, they are connected to cannot be hard-coded in the chips themselves, as this may vary on different line cards.
- However, with the present invention, it is possible to provide flexibility of chip arrangement with hard-wired routing information by giving each chip some simple rules and designing the topology and enumeration of the chips to support this. This has the dual benefit of simplicity and of being a natural extension to the routing mechanisms within chips themselves.
- FIG. 11 illustrates an example of a linear chip arrangement. Of course, it is appreciated that different topologies can be realised according to the present invention. In such a linear arrangement, it is easy to number the chips 1100(0) to 1100(2) etc. sequentially, so that any block in any chip knows that a transaction must be routed "up" or "down" the
interconnection 1110 to reach its destination, as indicated in the chip ID field of the physical address. This is exactly the same process as the block performs to route to another block on the same chip. In this case a two-level decision is utilised: if in the present chip, then route on the block ID, else route on the chip ID. - An alternative topology is shown in FIG. 12. It comprises a
first bus lane 1201 and a second bus lane 1202 arranged in parallel. The first and second bus lanes correspond to the interconnection system of the embodiment of the present invention. A plurality of multi-threaded array processors (MTAPs) 1210 are connected across the two bus lanes 1201, 1202. A network input device 1220, a collector device 1230, a distributor device 1240 and a network output device 1250 are connected to the second bus lane 1202, and a table lookup engine 1260 is connected to the first bus lane 1201. Details of the operation of the devices connected to the bus lanes are not provided here. - As illustrated in FIG. 12, in an alternative topology, the first bus lane 1201 (256 bits wide, for example) is dedicated to fast path packet data. A second bus lane 1202 (128 bits wide, for example) is used for general non-packet data, such as table lookups, instruction fetching, external memory access, etc.
- More generally, where the blocks are connected to the interconnection, and which bus lane or lanes 1201, 1202 they use, can be selected. Allowance for floor-planning constraints must obviously be made. Blocks can have multiple bus interfaces, for example. Lane widths can be configured to meet the bandwidth requirements of the system.
- Since the interconnection system of the present invention uses point-to-point connections between interfaces, and uses distributed arbitration, it is possible to have several pairs of functional blocks communicating simultaneously without any contention or interference. Traffic between blocks can only interfere if that traffic travels along a shared bus segment in the same direction. This situation can be avoided by choosing a suitable layout. Thus, bus contention can be avoided in the fast path packet flow. This is important to achieve predictable and reliable performance, and to avoid overprovisioning the interconnection.
- The example above avoids bus contention in the fast path, because the packet data flows left to right on
bus lane 1201 via NIP-DIS-MTAP-COL-NOP. Since packets do not cross any bus segment more than once, there is no bus contention. There is no interference between the MTAP processors, because only one at a time is sending or receiving. Another way to avoid bus contention is to place the MTAP processors on a “spur” off the main data path, as shown in FIG. 13. - This topology uses a T-
junction 1305 to exploit the fact that traffic going in opposite directions on the same bus segment 1300 is non-interfering. Using the T-junction block 1305 may ease the design of the bus topology to account for layout and floor-planning constraints. - At the lowest (hardware) level of abstraction, the interconnection system of the present invention preferably supports advanced virtual component interface transactions, which are simply variable-size messages as defined in the virtual component interface standard, sent from an initiator interface to a target interface, possibly followed by a response at a later time. Because the response may be delayed, this is called a split transaction in the virtual component interface system. The network processing system architecture defines two higher levels of abstraction in the inter-block communication protocol: the chunk and the abstract datagram (frequently simply called a datagram). A chunk is a logical entity that represents a fairly small amount of data to be transferred from one block to another. An abstract datagram is a logical entity that represents the natural unit of data for the application. In network processing applications, abstract datagrams almost always correspond to network datagrams or packets. The distinction is made to allow the architecture blocks to be used in other applications besides networking. Chunks are somewhat analogous to CSIX CFrames, and are used for similar purposes, that is, to have a convenient, small unit of data transfer. Chunks have a defined maximum size, typically about 512 bytes, while datagrams can be much larger, typically up to 9K bytes; the exact size limits are configurable. When a datagram needs to be transferred from one block to another, the actual transfer is done by sending a sequence of chunks. The chunks are packaged within a series of AVCI transactions at the bus interface.
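The datagram-to-chunk segmentation described above can be sketched as follows. The 512-byte figure comes from the text ("typically about 512 bytes"); the framing details and function name are assumptions.

```python
# Sketch of the datagram/chunk layering: a datagram is transferred as a
# sequence of chunks, each no larger than the configured maximum size.
# Each chunk would then be carried in one or more AVCI transactions.

MAX_CHUNK_BYTES = 512   # typical configured maximum chunk size, per the text

def chunk_datagram(datagram: bytes, max_chunk: int = MAX_CHUNK_BYTES):
    """Split a datagram into a sequence of chunks of at most max_chunk bytes."""
    return [datagram[i:i + max_chunk] for i in range(0, len(datagram), max_chunk)]

dg = bytes(1300)               # e.g. a 1300-byte datagram
chunks = chunk_datagram(dg)
assert [len(c) for c in chunks] == [512, 512, 276]
assert b"".join(chunks) == dg  # lossless reassembly
```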
- The system addressing scheme according to the embodiment of the present invention will now be described in more detail. The system according to an embodiment of the present invention may span a subsystem that is implemented in more than one chip.
- Looking first at an individual chip, the routing of transactions and responses between initiators and targets is performed by the interface blocks that connect to the interconnection itself, and the intervening T-switch elements in the interconnection. Addressing of the blocks according to an embodiment of the present invention is hardwired and geographic, and the routing information is compiled into the logic of the interface units, T-switches and node elements at chip integration time.
- The interface ID occupies the upper part of the physical 64-bit address, the lower bits being the offset within the block. Additional physical address bits are reserved for the chip ID to support multi-chip systems.
- The platform according to the embodiment of the present invention requires some modularity at the chip level as well as the block level on chips; knowledge of what other chips, or their innards, they are connected to cannot be hard-coded, as this may vary on different line cards. This prevents the use of the same hard-wired bus routing information scheme as exists in the interface units for transactions within one chip.
- However, it is possible to provide flexibility of chip arrangement with hardwired routing information by giving each chip some simple rules and designing the topology and enumeration of chips to support this. This has the dual benefits of simplicity and of being a natural extension to the routing mechanisms within the chips themselves.
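As a hedged sketch, the two-level decision mentioned earlier (route on the block ID when the destination is on the present chip, otherwise route "up" or "down" on the chip ID) might look like the following; the field names and the linear up/down convention are illustrative assumptions.

```python
# Sketch of the two-level routing decision for a linear chip arrangement:
# deliver locally on the block ID if the destination chip is this chip,
# otherwise forward along the chain based on the chip ID.

def route(own_chip_id: int, dest_chip_id: int, dest_block_id: int):
    if dest_chip_id == own_chip_id:
        return ("block", dest_block_id)      # local delivery within the chip
    direction = "up" if dest_chip_id > own_chip_id else "down"
    return ("chip", direction)               # forward toward the destination chip

assert route(own_chip_id=1, dest_chip_id=1, dest_block_id=7) == ("block", 7)
assert route(own_chip_id=1, dest_chip_id=2, dest_block_id=7) == ("chip", "up")
```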
- An example of a traffic handler subsystem in which the packet queue memory is implemented around two memory hub chips is shown in FIGS. 14 and 15.
- In the example, the four chips have four connections to other chips. This results in possible ambiguities about the route that a transaction takes from one chip to the next. Therefore, it is necessary to control the flow of transactions by configuring the hardware and software appropriately, but without having to include programmable routing functions in the interconnection.
- This is achieved by making the chip ID an x,y coordinate instead of a single number. For example, chip ID for
chip 1401 may be 4,2, chip 1403 may be 5,1, chip 1404 may be 5,3 and chip 1402 may be 6,2. Simple, hardwired rules are applied about how to route the next hop of a transaction destined for another chip. The chips are thus located on a virtual "grid" such that the local rules produce the transaction flows desired. The grid can be "distorted" by leaving gaps or dislocations to achieve the desired effect. - Each chip has complete knowledge of itself, including how many external ports it has and their assignments to N, S, E, W compass points. This knowledge is wired into the interface units and T-switches. A chip has no knowledge at all of what other chips are connected to it, or of their x,y coordinates.
- The local rules at each chip are as follows:
- 1. A transaction is routed out on the interface that forms the least angle with its relative grid location.
- 2. In the event of a tie, N-S interfaces are favoured over E-W.
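The two rules above can be sketched as a small port-selection function. This is a minimal illustration under stated assumptions: a coordinate convention of +x East and +y North, and illustrative port names; the patent does not give these details.

```python
# Sketch of the local routing rules: choose the external port whose compass
# direction forms the least angle with the destination's relative grid
# position, favouring N-S over E-W in the event of a tie.

import math

# Compass unit vectors; +x is assumed East, +y is assumed North.
DIRS = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

def next_hop(own_xy, dest_xy, ports):
    dx, dy = dest_xy[0] - own_xy[0], dest_xy[1] - own_xy[1]
    target = math.atan2(dy, dx)
    def angle(port):
        px, py = DIRS[port]
        a = abs(math.atan2(py, px) - target)
        return min(a, 2 * math.pi - a)          # smallest angular difference
    # Rule 1: least angle wins. Rule 2: N-S sorts before E-W on a tie.
    return min(ports, key=lambda p: (round(angle(p), 9), p not in ("N", "S")))

# Chip at (4,2) routing to (6,2): due East, so the E port is chosen.
assert next_hop((4, 2), (6, 2), ["N", "S", "E", "W"]) == "E"
# Destination to the north-east at 45 degrees: N and E tie, N-S is favoured.
assert next_hop((4, 2), (5, 3), ["N", "E"]) == "N"
```

Placing the chips on the virtual grid then amounts to choosing coordinates so that these purely local decisions compose into the desired end-to-end flows.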
- Applying these rules to the four-chip example above, transactions along the main horizontal axis through Arrivals &
Dispatch chips - Transactions from Arrivals or
Dispatch memory hubs - Responses from
memory hubs - The conjecture is that there is no chip topology or transaction flow that cannot be expressed by suitable choice of chip coordinates and application of the above rules.
- Although a preferred embodiment of the method and system of the present invention has been illustrated in the accompanying drawings and described in the foregoing detailed description, it will be understood that the invention is not limited to the embodiment disclosed, but is capable of numerous variations and modifications without departing from the scope of the invention as set out in the following claims.
Claims (30)
1. An interconnection system for connecting a plurality of functional units, the interconnection system comprising a plurality of nodes, each node communicating with a functional unit, the interconnection system transporting a plurality of data packets between functional units, each data packet having routing information associated therewith to enable a node to direct the data packet via the interconnection system.
2. An interconnection system for interconnecting a plurality of functional units and transporting a plurality of data packets, the interconnection system comprising a plurality of nodes, each node communicating with a functional unit wherein, during transportation of the data packets between a first node and a second node, only the portion of the interconnection system between the first node and the second node is activated.
3. An interconnection system for interconnecting a plurality of functional units, each functional unit connected to the interconnection system via an interface unit, the interconnection system comprising a plurality of nodes, the interconnection system transporting a plurality of data packets wherein each interface unit translates between the protocol for transporting the data packets and the protocol of the functional units.
4. An interconnection system for interconnecting a plurality of functional units and transporting a plurality of data packets between the functional units wherein arbitration is distributed to each functional unit.
5. An interconnection system according to any one of claims 2 to 4 , wherein each data packet has routing information associated therewith to enable a node to direct the data packet via the interconnection system.
6. An interconnection system according to any one of claims 1, 3 or 4, wherein, during transportation of a data packet between a first node and a second node, only the portion of the interconnection system between the first node and the second node is activated.
7. An interconnection system according to claims 1, 2 or 4, wherein the interconnection system is protocol agnostic.
8. An interconnection system according to any one of claims 1 to 3 , wherein arbitration is distributed between the functional units.
9. An interconnection system according to any one of the preceding claims further comprising a plurality of repeater units spaced along the interconnection at predetermined distances such that the data packets are transported between consecutive repeater units and/or nodes in a single clock cycle.
10. An interconnection system according to claim 9 , wherein the data packets are pipelined between the nodes and/or repeater units.
11. An interconnection system according to claim 9 or 10, wherein each repeater unit comprises means to compress data upon blockage of the interconnection.
12. An interconnection system according to any one of the preceding claims, wherein the routing information includes the x,y coordinates of the destination.
13. An interconnection system according to any one of the preceding claims, wherein the clocking along the length of the interconnection system is distributed.
14. An interconnection system according to any one of the preceding claims, wherein each node comprises an input buffer, inject control and/or consume control.
15. An interconnection system according to claim 14 , wherein each node can inject and output data at the same time.
16. An interconnection system according to any one of the preceding claims, wherein the interconnection system comprises a plurality of buses.
17. An interconnection system according to claim 16 , wherein each node is connected to at least one of the plurality of buses.
18. An interconnection system according to claim 16 or 17, wherein at least a part of each bus comprises a pair of unidirectional bus lanes.
19. An interconnection system according to claim 15 , wherein data is transported on each bus lane in an opposite direction.
20. An interconnection system according to any one of the preceding claims, further comprising at least one T-switch, the T-switch determining the direction to transport the data packets from the routing information associated with each data packet.
21. An interconnection system according to any one of the preceding claims, wherein delivery of the data packet is guaranteed.
22. A method for routing data packets between functional units, each data packet having routing information associated therewith, the method comprising the steps of:
(a) reading the routing information;
(b) determining the direction to transport the data packet from the routing information; and
(c) transporting the data packet in the direction determined in step (b).
23. A processing system incorporating the interconnection system according to any one of claims 1 to 21 .
24. A system according to claim 23 , wherein each functional unit is connected to a node via an interface unit.
25. A system according to claim 24 , wherein each interface unit comprises means to set the protocol for data to be transported to the interconnection system and to be received from the interconnection system.
26. A system according to any one of claims 24 to 25 , wherein the functional units access the interconnection system using distributed arbitration.
27. A system according to any one of claims 24 to 26 , wherein each functional unit comprises a reusable system on chip functional unit.
28. An integrated circuit incorporating the interconnection system according to any one of claims 1 to 21.
29. An integrated system comprising a plurality of chips, each chip incorporating the interconnection system according to any one of claims 1 to 21 , wherein the interconnection system interconnects the plurality of chips.
30. A method for transporting a plurality of data packets via an interconnection system, the interconnection system comprising a plurality of nodes, the method comprising the steps of:
transporting a data packet between a first node and a second node; and
during transportation, only activating the portion of the interconnection system between the first node and the second node.
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB0103687.0 | 2001-02-14 | ||
GB0103678.9 | 2001-02-14 | ||
GB0103678A GB0103678D0 (en) | 2001-02-14 | 2001-02-14 | Network processing |
GB0103687A GB0103687D0 (en) | 2001-02-14 | 2001-02-14 | Network processing-architecture II |
GB0121790A GB0121790D0 (en) | 2001-02-14 | 2001-09-10 | Network processing systems |
GB0121790.0 | 2001-09-10 | ||
PCT/GB2002/000662 WO2002065700A2 (en) | 2001-02-14 | 2002-02-14 | An interconnection system |
Publications (1)
Publication Number | Publication Date |
---|---|
US20040114609A1 true US20040114609A1 (en) | 2004-06-17 |
Family
ID=27256074
Family Applications (10)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/073,948 Expired - Fee Related US7856543B2 (en) | 2001-02-14 | 2002-02-14 | Data processing architectures for packet handling wherein batches of data packets of unpredictable size are distributed across processing elements arranged in a SIMD array operable to process different respective packet protocols at once while executing a single common instruction stream |
US10/468,167 Abandoned US20040114609A1 (en) | 2001-02-14 | 2002-02-14 | Interconnection system |
US10/074,022 Abandoned US20020159466A1 (en) | 2001-02-14 | 2002-02-14 | Lookup engine |
US10/468,168 Expired - Fee Related US7290162B2 (en) | 2001-02-14 | 2002-02-14 | Clock distribution system |
US10/074,019 Abandoned US20020161926A1 (en) | 2001-02-14 | 2002-02-14 | Method for controlling the order of datagrams |
US11/151,271 Expired - Fee Related US8200686B2 (en) | 2001-02-14 | 2005-06-14 | Lookup engine |
US11/151,292 Abandoned US20050242976A1 (en) | 2001-02-14 | 2005-06-14 | Lookup engine |
US11/752,299 Expired - Fee Related US7818541B2 (en) | 2001-02-14 | 2007-05-23 | Data processing architectures |
US11/752,300 Expired - Fee Related US7917727B2 (en) | 2001-02-14 | 2007-05-23 | Data processing architectures for packet handling using a SIMD array |
US12/965,673 Expired - Fee Related US8127112B2 (en) | 2001-02-14 | 2010-12-10 | SIMD array operable to process different respective packet protocols simultaneously while executing a single common instruction stream |
Country Status (6)
Country | Link |
---|---|
US (10) | US7856543B2 (en) |
JP (2) | JP2004524617A (en) |
CN (2) | CN1613041A (en) |
AU (1) | AU2002233500A1 (en) |
GB (5) | GB2390506B (en) |
WO (2) | WO2002065700A2 (en) |
US20040066779A1 (en) * | 2002-10-04 | 2004-04-08 | Craig Barrack | Method and implementation for context switchover |
US8037224B2 (en) | 2002-10-08 | 2011-10-11 | Netlogic Microsystems, Inc. | Delegating network processor operations to star topology serial bus interfaces |
US20050044324A1 (en) * | 2002-10-08 | 2005-02-24 | Abbas Rashid | Advanced processor with mechanism for maximizing resource usage in an in-order pipeline with multiple threads |
US7346757B2 (en) | 2002-10-08 | 2008-03-18 | Rmi Corporation | Advanced processor translation lookaside buffer management in a multithreaded system |
US9088474B2 (en) * | 2002-10-08 | 2015-07-21 | Broadcom Corporation | Advanced processor with interfacing messaging network to a CPU |
US8015567B2 (en) | 2002-10-08 | 2011-09-06 | Netlogic Microsystems, Inc. | Advanced processor with mechanism for packet distribution at high line rate |
US20050033831A1 (en) * | 2002-10-08 | 2005-02-10 | Abbas Rashid | Advanced processor with a thread aware return address stack optimally used across active threads |
US8478811B2 (en) | 2002-10-08 | 2013-07-02 | Netlogic Microsystems, Inc. | Advanced processor with credit based scheme for optimal packet flow in a multi-processor system on a chip |
US7984268B2 (en) | 2002-10-08 | 2011-07-19 | Netlogic Microsystems, Inc. | Advanced processor scheduling in a multithreaded system |
US7924828B2 (en) * | 2002-10-08 | 2011-04-12 | Netlogic Microsystems, Inc. | Advanced processor with mechanism for fast packet queuing operations |
US7627721B2 (en) | 2002-10-08 | 2009-12-01 | Rmi Corporation | Advanced processor with cache coherency |
US7961723B2 (en) * | 2002-10-08 | 2011-06-14 | Netlogic Microsystems, Inc. | Advanced processor with mechanism for enforcing ordering between information sent on two independent networks |
US7334086B2 (en) * | 2002-10-08 | 2008-02-19 | Rmi Corporation | Advanced processor with system on a chip interconnect technology |
US8176298B2 (en) | 2002-10-08 | 2012-05-08 | Netlogic Microsystems, Inc. | Multi-core multi-threaded processing systems with instruction reordering in an in-order pipeline |
US7596621B1 (en) * | 2002-10-17 | 2009-09-29 | Astute Networks, Inc. | System and method for managing shared state using multiple programmed processors |
US7814218B1 (en) | 2002-10-17 | 2010-10-12 | Astute Networks, Inc. | Multi-protocol and multi-format stateful processing |
US8151278B1 (en) | 2002-10-17 | 2012-04-03 | Astute Networks, Inc. | System and method for timer management in a stateful protocol processing system |
ATE438242T1 (en) * | 2002-10-31 | 2009-08-15 | Alcatel Lucent | METHOD FOR PROCESSING DATA PACKETS AT LAYER THREE IN A TELECOMMUNICATIONS DEVICE |
US7715392B2 (en) * | 2002-12-12 | 2010-05-11 | Stmicroelectronics, Inc. | System and method for path compression optimization in a pipelined hardware bitmapped multi-bit trie algorithmic network search engine |
JP4157403B2 (en) * | 2003-03-19 | 2008-10-01 | 株式会社日立製作所 | Packet communication device |
US8477780B2 (en) * | 2003-03-26 | 2013-07-02 | Alcatel Lucent | Processing packet information using an array of processing elements |
US8539089B2 (en) * | 2003-04-23 | 2013-09-17 | Oracle America, Inc. | System and method for vertical perimeter protection |
EP1623330A2 (en) | 2003-05-07 | 2006-02-08 | Koninklijke Philips Electronics N.V. | Processing system and method for transmitting data |
US7558268B2 (en) * | 2003-05-08 | 2009-07-07 | Samsung Electronics Co., Ltd. | Apparatus and method for combining forwarding tables in a distributed architecture router |
US7500239B2 (en) * | 2003-05-23 | 2009-03-03 | Intel Corporation | Packet processing system |
US20050108518A1 (en) * | 2003-06-10 | 2005-05-19 | Pandya Ashish A. | Runtime adaptable security processor |
US7349958B2 (en) * | 2003-06-25 | 2008-03-25 | International Business Machines Corporation | Method for improving performance in a computer storage system by regulating resource requests from clients |
US7174398B2 (en) * | 2003-06-26 | 2007-02-06 | International Business Machines Corporation | Method and apparatus for implementing data mapping with shuffle algorithm |
US7702882B2 (en) * | 2003-09-10 | 2010-04-20 | Samsung Electronics Co., Ltd. | Apparatus and method for performing high-speed lookups in a routing table |
CA2442803A1 (en) * | 2003-09-26 | 2005-03-26 | Ibm Canada Limited - Ibm Canada Limitee | Structure and method for managing workshares in a parallel region |
US7886307B1 (en) * | 2003-09-26 | 2011-02-08 | The Mathworks, Inc. | Object-oriented data transfer system for data sharing |
US7120815B2 (en) * | 2003-10-31 | 2006-10-10 | Hewlett-Packard Development Company, L.P. | Clock circuitry on plural integrated circuits |
US7634500B1 (en) | 2003-11-03 | 2009-12-15 | Netlogic Microsystems, Inc. | Multiple string searching using content addressable memory |
AU2004297923B2 (en) * | 2003-11-26 | 2008-07-10 | Cisco Technology, Inc. | Method and apparatus to inline encryption and decryption for a wireless station |
US6954450B2 (en) * | 2003-11-26 | 2005-10-11 | Cisco Technology, Inc. | Method and apparatus to provide data streaming over a network connection in a wireless MAC processor |
US7340548B2 (en) | 2003-12-17 | 2008-03-04 | Microsoft Corporation | On-chip bus |
US7058424B2 (en) * | 2004-01-20 | 2006-06-06 | Lucent Technologies Inc. | Method and apparatus for interconnecting wireless and wireline networks |
GB0403237D0 (en) * | 2004-02-13 | 2004-03-17 | Imec Inter Uni Micro Electr | A method for realizing ground bounce reduction in digital circuits adapted according to said method |
US7903777B1 (en) * | 2004-03-03 | 2011-03-08 | Marvell International Ltd. | System and method for reducing electromagnetic interference and ground bounce in an information communication system by controlling phase of clock signals among a plurality of information communication devices |
US7478109B1 (en) * | 2004-03-15 | 2009-01-13 | Cisco Technology, Inc. | Identification of a longest matching prefix based on a search of intervals corresponding to the prefixes |
KR100990484B1 (en) * | 2004-03-29 | 2010-10-29 | 삼성전자주식회사 | Transmission clock signal generator for serial bus communication |
US20050254486A1 (en) * | 2004-05-13 | 2005-11-17 | Ittiam Systems (P) Ltd. | Multi processor implementation for signals requiring fast processing |
DE102004035843B4 (en) * | 2004-07-23 | 2010-04-15 | Infineon Technologies Ag | Router Network Processor |
GB2417105B (en) | 2004-08-13 | 2008-04-09 | Clearspeed Technology Plc | Processor memory system |
US7913206B1 (en) * | 2004-09-16 | 2011-03-22 | Cadence Design Systems, Inc. | Method and mechanism for performing partitioning of DRC operations |
US7508397B1 (en) * | 2004-11-10 | 2009-03-24 | Nvidia Corporation | Rendering of disjoint and overlapping blits |
US8170019B2 (en) * | 2004-11-30 | 2012-05-01 | Broadcom Corporation | CPU transmission of unmodified packets |
US20060156316A1 (en) * | 2004-12-18 | 2006-07-13 | Gray Area Technologies | System and method for application specific array processing |
US20060212426A1 (en) * | 2004-12-21 | 2006-09-21 | Udaya Shakara | Efficient CAM-based techniques to perform string searches in packet payloads |
US7818705B1 (en) | 2005-04-08 | 2010-10-19 | Altera Corporation | Method and apparatus for implementing a field programmable gate array architecture with programmable clock skew |
WO2006127596A2 (en) * | 2005-05-20 | 2006-11-30 | Hillcrest Laboratories, Inc. | Dynamic hyperlinking approach |
US7373475B2 (en) * | 2005-06-21 | 2008-05-13 | Intel Corporation | Methods for optimizing memory unit usage to maximize packet throughput for multi-processor multi-threaded architectures |
US20070086456A1 (en) * | 2005-08-12 | 2007-04-19 | Electronics And Telecommunications Research Institute | Integrated layer frame processing device including variable protocol header |
US7904852B1 (en) | 2005-09-12 | 2011-03-08 | Cadence Design Systems, Inc. | Method and system for implementing parallel processing of electronic design automation tools |
US8218770B2 (en) * | 2005-09-13 | 2012-07-10 | Agere Systems Inc. | Method and apparatus for secure key management and protection |
US7353332B2 (en) * | 2005-10-11 | 2008-04-01 | Integrated Device Technology, Inc. | Switching circuit implementing variable string matching |
US7451293B2 (en) * | 2005-10-21 | 2008-11-11 | Brightscale Inc. | Array of Boolean logic controlled processing elements with concurrent I/O processing and instruction sequencing |
US7551609B2 (en) * | 2005-10-21 | 2009-06-23 | Cisco Technology, Inc. | Data structure for storing and accessing multiple independent sets of forwarding information |
US7835359B2 (en) * | 2005-12-08 | 2010-11-16 | International Business Machines Corporation | Method and apparatus for striping message payload data over a network |
JP2009523292A (en) * | 2006-01-10 | 2009-06-18 | ブライトスケール インコーポレイテッド | Method and apparatus for scheduling multimedia data processing in parallel processing systems |
US20070162531A1 (en) * | 2006-01-12 | 2007-07-12 | Bhaskar Kota | Flow transform for integrated circuit design and simulation having combined data flow, control flow, and memory flow views |
US8301885B2 (en) * | 2006-01-27 | 2012-10-30 | Fts Computertechnik Gmbh | Time-controlled secure communication |
KR20070088190A (en) * | 2006-02-24 | 2007-08-29 | 삼성전자주식회사 | Subword parallelism for processing multimedia data |
EP2000973B1 (en) * | 2006-03-30 | 2013-05-01 | NEC Corporation | Parallel image processing system control method and apparatus |
US7617409B2 (en) * | 2006-05-01 | 2009-11-10 | Arm Limited | System for checking clock-signal correspondence |
US8390354B2 (en) * | 2006-05-17 | 2013-03-05 | Freescale Semiconductor, Inc. | Delay configurable device and methods thereof |
US8041929B2 (en) * | 2006-06-16 | 2011-10-18 | Cisco Technology, Inc. | Techniques for hardware-assisted multi-threaded processing |
JP2008004046A (en) * | 2006-06-26 | 2008-01-10 | Toshiba Corp | Resource management device, and program for the same |
US7584286B2 (en) * | 2006-06-28 | 2009-09-01 | Intel Corporation | Flexible and extensible receive side scaling |
US8448096B1 (en) | 2006-06-30 | 2013-05-21 | Cadence Design Systems, Inc. | Method and system for parallel processing of IC design layouts |
US7516437B1 (en) * | 2006-07-20 | 2009-04-07 | Xilinx, Inc. | Skew-driven routing for networks |
CN1909418B (en) * | 2006-08-01 | 2010-05-12 | 华为技术有限公司 | Clock distributing equipment for universal wireless interface and method for realizing speed switching |
US20080040214A1 (en) * | 2006-08-10 | 2008-02-14 | Ip Commerce | System and method for subsidizing payment transaction costs through online advertising |
JP4846486B2 (en) * | 2006-08-18 | 2011-12-28 | 富士通株式会社 | Information processing apparatus and control method thereof |
CA2557343C (en) * | 2006-08-28 | 2015-09-22 | Ibm Canada Limited-Ibm Canada Limitee | Runtime code modification in a multi-threaded environment |
WO2008027567A2 (en) * | 2006-09-01 | 2008-03-06 | Brightscale, Inc. | Integral parallel machine |
US20080059762A1 (en) * | 2006-09-01 | 2008-03-06 | Bogdan Mitu | Multi-sequence control for a data parallel system |
US9563433B1 (en) | 2006-09-01 | 2017-02-07 | Allsearch Semi Llc | System and method for class-based execution of an instruction broadcasted to an array of processing elements |
US20080055307A1 (en) * | 2006-09-01 | 2008-03-06 | Lazar Bivolarski | Graphics rendering pipeline |
US20080244238A1 (en) * | 2006-09-01 | 2008-10-02 | Bogdan Mitu | Stream processing accelerator |
US20080059763A1 (en) * | 2006-09-01 | 2008-03-06 | Lazar Bivolarski | System and method for fine-grain instruction parallelism for increased efficiency of processing compressed multimedia data |
US20080059467A1 (en) * | 2006-09-05 | 2008-03-06 | Lazar Bivolarski | Near full motion search algorithm |
US7657856B1 (en) | 2006-09-12 | 2010-02-02 | Cadence Design Systems, Inc. | Method and system for parallel processing of IC design layouts |
US7783654B1 (en) | 2006-09-19 | 2010-08-24 | Netlogic Microsystems, Inc. | Multiple string searching using content addressable memory |
JP4377899B2 (en) * | 2006-09-20 | 2009-12-02 | 株式会社東芝 | Resource management apparatus and program |
US8010966B2 (en) * | 2006-09-27 | 2011-08-30 | Cisco Technology, Inc. | Multi-threaded processing using path locks |
US8179896B2 (en) | 2006-11-09 | 2012-05-15 | Justin Mark Sobaje | Network processors and pipeline optimization methods |
US7996348B2 (en) | 2006-12-08 | 2011-08-09 | Pandya Ashish A | 100GBPS security and search architecture using programmable intelligent search memory (PRISM) that comprises one or more bit interval counters |
US9141557B2 (en) | 2006-12-08 | 2015-09-22 | Ashish A. Pandya | Dynamic random access memory (DRAM) that comprises a programmable intelligent search memory (PRISM) and a cryptography processing engine |
JP4249780B2 (en) * | 2006-12-26 | 2009-04-08 | 株式会社東芝 | Device and program for managing resources |
US7676444B1 (en) | 2007-01-18 | 2010-03-09 | Netlogic Microsystems, Inc. | Iterative compare operations using next success size bitmap |
ATE508415T1 (en) * | 2007-03-06 | 2011-05-15 | Nec Corp | DATA TRANSFER NETWORK AND CONTROL DEVICE FOR A SYSTEM HAVING AN ARRAY OF PROCESSING ELEMENTS EACH EITHER SELF-CONTROLLED OR JOINTLY CONTROLLED |
JP2009086733A (en) * | 2007-09-27 | 2009-04-23 | Toshiba Corp | Information processor, control method of information processor and control program of information processor |
US8515052B2 (en) * | 2007-12-17 | 2013-08-20 | Wai Wu | Parallel signal processing system and method |
US9596324B2 (en) | 2008-02-08 | 2017-03-14 | Broadcom Corporation | System and method for parsing and allocating a plurality of packets to processor core threads |
US8250578B2 (en) * | 2008-02-22 | 2012-08-21 | International Business Machines Corporation | Pipelining hardware accelerators to computer systems |
US8726289B2 (en) * | 2008-02-22 | 2014-05-13 | International Business Machines Corporation | Streaming attachment of hardware accelerators to computer systems |
CN102077493B (en) * | 2008-04-30 | 2015-01-14 | 惠普开发有限公司 | Intentionally skewed optical clock signal distribution |
JP2009271724A (en) * | 2008-05-07 | 2009-11-19 | Toshiba Corp | Hardware engine controller |
US9619428B2 (en) | 2008-05-30 | 2017-04-11 | Advanced Micro Devices, Inc. | SIMD processing unit with local data share and access to a global data share of a GPU |
US8958419B2 (en) * | 2008-06-16 | 2015-02-17 | Intel Corporation | Switch fabric primitives |
US8566487B2 (en) * | 2008-06-24 | 2013-10-22 | Hartvig Ekner | System and method for creating a scalable monolithic packet processing engine |
US7949007B1 (en) | 2008-08-05 | 2011-05-24 | Xilinx, Inc. | Methods of clustering actions for manipulating packets of a communication protocol |
US7804844B1 (en) * | 2008-08-05 | 2010-09-28 | Xilinx, Inc. | Dataflow pipeline implementing actions for manipulating packets of a communication protocol |
US8311057B1 (en) | 2008-08-05 | 2012-11-13 | Xilinx, Inc. | Managing formatting of packets of a communication protocol |
US8160092B1 (en) | 2008-08-05 | 2012-04-17 | Xilinx, Inc. | Transforming a declarative description of a packet processor |
EP2327026A1 (en) * | 2008-08-06 | 2011-06-01 | Nxp B.V. | Simd parallel processor architecture |
CN101355482B (en) * | 2008-09-04 | 2011-09-21 | 中兴通讯股份有限公司 | Equipment, method and system for implementing identification of embedded device address sequence |
US8493979B2 (en) * | 2008-12-30 | 2013-07-23 | Intel Corporation | Single instruction processing of network packets |
JP5238525B2 (en) * | 2009-01-13 | 2013-07-17 | 株式会社東芝 | Device and program for managing resources |
KR101553652B1 (en) * | 2009-02-18 | 2015-09-16 | 삼성전자 주식회사 | Apparatus and method for compiling instruction for heterogeneous processor |
US8140792B2 (en) * | 2009-02-25 | 2012-03-20 | International Business Machines Corporation | Indirectly-accessed, hardware-affine channel storage in transaction-oriented DMA-intensive environments |
US8874878B2 (en) * | 2010-05-18 | 2014-10-28 | Lsi Corporation | Thread synchronization in a multi-thread, multi-flow network communications processor architecture |
US9461930B2 (en) | 2009-04-27 | 2016-10-04 | Intel Corporation | Modifying data streams without reordering in a multi-thread, multi-flow network processor |
US8743877B2 (en) * | 2009-12-21 | 2014-06-03 | Steven L. Pope | Header processing engine |
US8332460B2 (en) * | 2010-04-14 | 2012-12-11 | International Business Machines Corporation | Performing a local reduction operation on a parallel computer |
KR20130141446A (en) * | 2010-07-19 | 2013-12-26 | 어드밴스드 마이크로 디바이시즈, 인코포레이티드 | Data processing using on-chip memory in multiple processing units |
US8880507B2 (en) * | 2010-07-22 | 2014-11-04 | Brocade Communications Systems, Inc. | Longest prefix match using binary search tree |
US8904115B2 (en) * | 2010-09-28 | 2014-12-02 | Texas Instruments Incorporated | Cache with multiple access pipelines |
RU2436151C1 (en) * | 2010-11-01 | 2011-12-10 | Federal State Unitary Enterprise "Russian Federal Nuclear Center — All-Russian Research Institute of Experimental Physics" (FSUE "RFNC-VNIIEF") | Method of determining structure of hybrid computer system |
US9667539B2 (en) * | 2011-01-17 | 2017-05-30 | Alcatel Lucent | Method and apparatus for providing transport of customer QoS information via PBB networks |
US8869162B2 (en) | 2011-04-26 | 2014-10-21 | Microsoft Corporation | Stream processing on heterogeneous hardware devices |
US9020892B2 (en) * | 2011-07-08 | 2015-04-28 | Microsoft Technology Licensing, Llc | Efficient metadata storage |
US8880494B2 (en) | 2011-07-28 | 2014-11-04 | Brocade Communications Systems, Inc. | Longest prefix match scheme |
US8923306B2 (en) | 2011-08-02 | 2014-12-30 | Cavium, Inc. | Phased bucket pre-fetch in a network processor |
US9344366B2 (en) | 2011-08-02 | 2016-05-17 | Cavium, Inc. | System and method for rule matching in a processor |
US8910178B2 (en) | 2011-08-10 | 2014-12-09 | International Business Machines Corporation | Performing a global barrier operation in a parallel computer |
US9154335B2 (en) * | 2011-11-08 | 2015-10-06 | Marvell Israel (M.I.S.L) Ltd. | Method and apparatus for transmitting data on a network |
US9542236B2 (en) | 2011-12-29 | 2017-01-10 | Oracle International Corporation | Efficiency sequencer for multiple concurrently-executing threads of execution |
WO2013100783A1 (en) * | 2011-12-29 | 2013-07-04 | Intel Corporation | Method and system for control signalling in a data path module |
US9495135B2 (en) | 2012-02-09 | 2016-11-15 | International Business Machines Corporation | Developing collective operations for a parallel computer |
US9178730B2 (en) | 2012-02-24 | 2015-11-03 | Freescale Semiconductor, Inc. | Clock distribution module, synchronous digital system and method therefor |
JP6353359B2 (en) * | 2012-03-23 | 2018-07-04 | 株式会社Mush−A | Data processing apparatus, data processing system, data structure, recording medium, storage device, and data processing method |
JP2013222364A (en) * | 2012-04-18 | 2013-10-28 | Renesas Electronics Corp | Signal processing circuit |
US8775727B2 (en) | 2012-08-31 | 2014-07-08 | Lsi Corporation | Lookup engine with pipelined access, speculative add and lock-in-hit function |
US9082078B2 (en) | 2012-07-27 | 2015-07-14 | The Intellisis Corporation | Neural processing engine and architecture using the same |
CN103631315A (en) * | 2012-08-22 | 2014-03-12 | 上海华虹集成电路有限责任公司 | Clock design method facilitating timing sequence repair |
US9185057B2 (en) * | 2012-12-05 | 2015-11-10 | The Intellisis Corporation | Smart memory |
US9639371B2 (en) * | 2013-01-29 | 2017-05-02 | Advanced Micro Devices, Inc. | Solution to divergent branches in a SIMD core using hardware pointers |
US9391893B2 (en) | 2013-02-26 | 2016-07-12 | Dell Products L.P. | Lookup engine for an information handling system |
US20140269690A1 (en) * | 2013-03-13 | 2014-09-18 | Qualcomm Incorporated | Network element with distributed flow tables |
US9185003B1 (en) * | 2013-05-02 | 2015-11-10 | Amazon Technologies, Inc. | Distributed clock network with time synchronization and activity tracing between nodes |
US10331583B2 (en) | 2013-09-26 | 2019-06-25 | Intel Corporation | Executing distributed memory operations using processing elements connected by distributed channels |
ES2649163T3 (en) * | 2013-10-11 | 2018-01-10 | Wpt Gmbh | Elastic floor covering in the form of a continuously rolling material |
US20150120224A1 (en) * | 2013-10-29 | 2015-04-30 | C3 Energy, Inc. | Systems and methods for processing data relating to energy usage |
EP3075135A1 (en) | 2013-11-29 | 2016-10-05 | Nec Corporation | Apparatus, system and method for mtc |
US9547553B1 (en) | 2014-03-10 | 2017-01-17 | Parallel Machines Ltd. | Data resiliency in a shared memory pool |
US9372724B2 (en) * | 2014-04-01 | 2016-06-21 | Freescale Semiconductor, Inc. | System and method for conditional task switching during ordering scope transitions |
US9372723B2 (en) * | 2014-04-01 | 2016-06-21 | Freescale Semiconductor, Inc. | System and method for conditional task switching during ordering scope transitions |
US9781027B1 (en) | 2014-04-06 | 2017-10-03 | Parallel Machines Ltd. | Systems and methods to communicate with external destinations via a memory network |
US9690713B1 (en) | 2014-04-22 | 2017-06-27 | Parallel Machines Ltd. | Systems and methods for effectively interacting with a flash memory |
US9529622B1 (en) | 2014-12-09 | 2016-12-27 | Parallel Machines Ltd. | Systems and methods for automatic generation of task-splitting code |
US9477412B1 (en) | 2014-12-09 | 2016-10-25 | Parallel Machines Ltd. | Systems and methods for automatically aggregating write requests |
US9733981B2 (en) | 2014-06-10 | 2017-08-15 | Nxp Usa, Inc. | System and method for conditional task switching during ordering scope transitions |
US9753873B1 (en) | 2014-12-09 | 2017-09-05 | Parallel Machines Ltd. | Systems and methods for key-value transactions |
US9781225B1 (en) | 2014-12-09 | 2017-10-03 | Parallel Machines Ltd. | Systems and methods for cache streams |
US9639473B1 (en) | 2014-12-09 | 2017-05-02 | Parallel Machines Ltd. | Utilizing a cache mechanism by copying a data set from a cache-disabled memory location to a cache-enabled memory location |
US9632936B1 (en) | 2014-12-09 | 2017-04-25 | Parallel Machines Ltd. | Two-tier distributed memory |
WO2016118979A2 (en) | 2015-01-23 | 2016-07-28 | C3, Inc. | Systems, methods, and devices for an enterprise internet-of-things application development platform |
US9552327B2 (en) | 2015-01-29 | 2017-01-24 | Knuedge Incorporated | Memory controller for a network on a chip device |
US10061531B2 (en) | 2015-01-29 | 2018-08-28 | Knuedge Incorporated | Uniform system wide addressing for a computing system |
US9749225B2 (en) * | 2015-04-17 | 2017-08-29 | Huawei Technologies Co., Ltd. | Software defined network (SDN) control signaling for traffic engineering to enable multi-type transport in a data plane |
US20160381136A1 (en) * | 2015-06-24 | 2016-12-29 | Futurewei Technologies, Inc. | System, method, and computer program for providing rest services to fine-grained resources based on a resource-oriented network |
CN106326967B (en) * | 2015-06-29 | 2023-05-05 | 四川谦泰仁投资管理有限公司 | RFID chip with interactive switch input port |
US10313231B1 (en) * | 2016-02-08 | 2019-06-04 | Barefoot Networks, Inc. | Resilient hashing for forwarding packets |
US10063407B1 (en) | 2016-02-08 | 2018-08-28 | Barefoot Networks, Inc. | Identifying and marking failed egress links in data plane |
US10027583B2 (en) | 2016-03-22 | 2018-07-17 | Knuedge Incorporated | Chained packet sequences in a network on a chip architecture |
US9595308B1 (en) | 2016-03-31 | 2017-03-14 | Altera Corporation | Multiple-die synchronous insertion delay measurement circuit and methods |
US10346049B2 (en) | 2016-04-29 | 2019-07-09 | Friday Harbor Llc | Distributed contiguous reads in a network on a chip architecture |
US10402168B2 (en) | 2016-10-01 | 2019-09-03 | Intel Corporation | Low energy consumption mantissa multiplication for floating point multiply-add operations |
US10664942B2 (en) * | 2016-10-21 | 2020-05-26 | Advanced Micro Devices, Inc. | Reconfigurable virtual graphics and compute processor pipeline |
US10084687B1 (en) | 2016-11-17 | 2018-09-25 | Barefoot Networks, Inc. | Weighted-cost multi-pathing using range lookups |
US10416999B2 (en) | 2016-12-30 | 2019-09-17 | Intel Corporation | Processors, methods, and systems with a configurable spatial accelerator |
US10474375B2 (en) | 2016-12-30 | 2019-11-12 | Intel Corporation | Runtime address disambiguation in acceleration hardware |
US10558575B2 (en) | 2016-12-30 | 2020-02-11 | Intel Corporation | Processors, methods, and systems with a configurable spatial accelerator |
US10572376B2 (en) | 2016-12-30 | 2020-02-25 | Intel Corporation | Memory ordering in acceleration hardware |
US10237206B1 (en) | 2017-03-05 | 2019-03-19 | Barefoot Networks, Inc. | Equal cost multiple path group failover for multicast |
US10404619B1 (en) | 2017-03-05 | 2019-09-03 | Barefoot Networks, Inc. | Link aggregation group failover for multicast |
US10296351B1 (en) * | 2017-03-15 | 2019-05-21 | Ambarella, Inc. | Computer vision processing in hardware data paths |
CN107704922B (en) * | 2017-04-19 | 2020-12-08 | 赛灵思公司 | Artificial neural network processing device |
CN107679621B (en) | 2017-04-19 | 2020-12-08 | 赛灵思公司 | Artificial neural network processing device |
CN107679620B (en) * | 2017-04-19 | 2020-05-26 | 赛灵思公司 | Artificial neural network processing device |
US10514719B2 (en) * | 2017-06-27 | 2019-12-24 | Biosense Webster (Israel) Ltd. | System and method for synchronization among clocks in a wireless system |
US10445451B2 (en) | 2017-07-01 | 2019-10-15 | Intel Corporation | Processors, methods, and systems for a configurable spatial accelerator with performance, correctness, and power reduction features |
US10387319B2 (en) | 2017-07-01 | 2019-08-20 | Intel Corporation | Processors, methods, and systems for a configurable spatial accelerator with memory system performance, power reduction, and atomics support features |
US10467183B2 (en) | 2017-07-01 | 2019-11-05 | Intel Corporation | Processors and methods for pipelined runtime services in a spatial array |
US10445234B2 (en) | 2017-07-01 | 2019-10-15 | Intel Corporation | Processors, methods, and systems for a configurable spatial accelerator with transactional and replay features |
US10515049B1 (en) | 2017-07-01 | 2019-12-24 | Intel Corporation | Memory circuits and methods for distributed memory hazard detection and error recovery |
US10515046B2 (en) | 2017-07-01 | 2019-12-24 | Intel Corporation | Processors, methods, and systems with a configurable spatial accelerator |
US10469397B2 (en) | 2017-07-01 | 2019-11-05 | Intel Corporation | Processors and methods with configurable network-based dataflow operator circuits |
US10496574B2 (en) | 2017-09-28 | 2019-12-03 | Intel Corporation | Processors, methods, and systems for a memory fence in a configurable spatial accelerator |
US11086816B2 (en) | 2017-09-28 | 2021-08-10 | Intel Corporation | Processors, methods, and systems for debugging a configurable spatial accelerator |
US10445098B2 (en) | 2017-09-30 | 2019-10-15 | Intel Corporation | Processors and methods for privileged configuration in a spatial array |
US10380063B2 (en) | 2017-09-30 | 2019-08-13 | Intel Corporation | Processors, methods, and systems with a configurable spatial accelerator having a sequencer dataflow operator |
CN107831824B (en) * | 2017-10-16 | 2021-04-06 | 北京比特大陆科技有限公司 | Clock signal transmission method and device, multiplexing chip and electronic equipment |
GB2568087B (en) * | 2017-11-03 | 2022-07-20 | Imagination Tech Ltd | Activation functions for deep neural networks |
US10565134B2 (en) | 2017-12-30 | 2020-02-18 | Intel Corporation | Apparatus, methods, and systems for multicast in a configurable spatial accelerator |
US10417175B2 (en) | 2017-12-30 | 2019-09-17 | Intel Corporation | Apparatus, methods, and systems for memory consistency in a configurable spatial accelerator |
US10445250B2 (en) | 2017-12-30 | 2019-10-15 | Intel Corporation | Apparatus, methods, and systems with a configurable spatial accelerator |
JP2019153909A (en) * | 2018-03-02 | 2019-09-12 | 株式会社リコー | Semiconductor integrated circuit and clock supply method |
US11307873B2 (en) | 2018-04-03 | 2022-04-19 | Intel Corporation | Apparatus, methods, and systems for unstructured data flow in a configurable spatial accelerator with predicate propagation and merging |
US10564980B2 (en) | 2018-04-03 | 2020-02-18 | Intel Corporation | Apparatus, methods, and systems for conditional queues in a configurable spatial accelerator |
US10853073B2 (en) | 2018-06-30 | 2020-12-01 | Intel Corporation | Apparatuses, methods, and systems for conditional operations in a configurable spatial accelerator |
US10891240B2 (en) | 2018-06-30 | 2021-01-12 | Intel Corporation | Apparatus, methods, and systems for low latency communication in a configurable spatial accelerator |
US11200186B2 (en) | 2018-06-30 | 2021-12-14 | Intel Corporation | Apparatuses, methods, and systems for operations in a configurable spatial accelerator |
US10459866B1 (en) | 2018-06-30 | 2019-10-29 | Intel Corporation | Apparatuses, methods, and systems for integrated control and data processing in a configurable spatial accelerator |
US11176281B2 (en) * | 2018-10-08 | 2021-11-16 | Micron Technology, Inc. | Security managers and methods for implementing security protocols in a reconfigurable fabric |
US10678724B1 (en) | 2018-12-29 | 2020-06-09 | Intel Corporation | Apparatuses, methods, and systems for in-network storage in a configurable spatial accelerator |
US11029927B2 (en) | 2019-03-30 | 2021-06-08 | Intel Corporation | Methods and apparatus to detect and annotate backedges in a dataflow graph |
US10965536B2 (en) | 2019-03-30 | 2021-03-30 | Intel Corporation | Methods and apparatus to insert buffers in a dataflow graph |
US10817291B2 (en) | 2019-03-30 | 2020-10-27 | Intel Corporation | Apparatuses, methods, and systems for swizzle operations in a configurable spatial accelerator |
US10915471B2 (en) | 2019-03-30 | 2021-02-09 | Intel Corporation | Apparatuses, methods, and systems for memory interface circuit allocation in a configurable spatial accelerator |
US11288244B2 (en) * | 2019-06-10 | 2022-03-29 | Akamai Technologies, Inc. | Tree deduplication |
US11037050B2 (en) | 2019-06-29 | 2021-06-15 | Intel Corporation | Apparatuses, methods, and systems for memory interface circuit arbitration in a configurable spatial accelerator |
US11907713B2 (en) | 2019-12-28 | 2024-02-20 | Intel Corporation | Apparatuses, methods, and systems for fused operations using sign modification in a processing element of a configurable spatial accelerator |
US11320885B2 (en) | 2020-05-26 | 2022-05-03 | Dell Products L.P. | Wide range power mechanism for over-speed memory design |
CN114528246A (en) * | 2020-11-23 | 2022-05-24 | 深圳比特微电子科技有限公司 | Operation core, calculation chip and encrypted currency mining machine |
US11768714B2 (en) | 2021-06-22 | 2023-09-26 | Microsoft Technology Licensing, Llc | On-chip hardware semaphore array supporting multiple conditionals |
US11797480B2 (en) * | 2021-12-31 | 2023-10-24 | Tsx Inc. | Storage of order books with persistent data structures |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5659781A (en) * | 1994-06-29 | 1997-08-19 | Larson; Noble G. | Bidirectional systolic ring network |
US5828858A (en) * | 1996-09-16 | 1998-10-27 | Virginia Tech Intellectual Properties, Inc. | Worm-hole run-time reconfigurable processor field programmable gate array (FPGA) |
US5923660A (en) * | 1996-01-31 | 1999-07-13 | Galileo Technologies Ltd. | Switching ethernet controller |
US6009488A (en) * | 1997-11-07 | 1999-12-28 | Microlinc, Llc | Computer having packet-based interconnect channel |
US6208619B1 (en) * | 1997-03-27 | 2001-03-27 | Kabushiki Kaisha Toshiba | Packet data flow control method and device |
US6366584B1 (en) * | 1999-02-06 | 2002-04-02 | Triton Network Systems, Inc. | Commercial network based on point to point radios |
Family Cites Families (151)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
IT1061921B (en) * | 1976-06-23 | 1983-04-30 | Lolli & C Spa | IMPROVEMENT IN DIFFUSERS FOR AIR CONDITIONING SYSTEMS |
USD259208S (en) * | 1979-04-23 | 1981-05-12 | Mccullough John R | Roof vent |
GB8401805D0 (en) * | 1984-01-24 | 1984-02-29 | Int Computers Ltd | Data processing apparatus |
JPS61156338A (en) * | 1984-12-27 | 1986-07-16 | Toshiba Corp | Multiprocessor system |
US4641571A (en) * | 1985-07-15 | 1987-02-10 | Enamel Products & Plating Co. | Turbo fan vent |
US4850027A (en) * | 1985-07-26 | 1989-07-18 | International Business Machines Corporation | Configurable parallel pipeline image processing system |
JP2564805B2 (en) * | 1985-08-08 | 1996-12-18 | 日本電気株式会社 | Information processing device |
JPH0771111B2 (en) * | 1985-09-13 | 1995-07-31 | 日本電気株式会社 | Packet exchange processor |
US5021947A (en) * | 1986-03-31 | 1991-06-04 | Hughes Aircraft Company | Data-flow multiprocessor architecture with three dimensional multistage interconnection network for efficient signal and data processing |
GB8618943D0 (en) * | 1986-08-02 | 1986-09-10 | Int Computers Ltd | Data processing apparatus |
DE3751412T2 (en) * | 1986-09-02 | 1995-12-14 | Fuji Photo Film Co Ltd | Method and device for image processing with gradation correction of the image signal. |
US5418970A (en) * | 1986-12-17 | 1995-05-23 | Massachusetts Institute Of Technology | Parallel processing system with processor array with processing elements addressing associated memories using host supplied address value and base register content |
GB8723203D0 (en) * | 1987-10-02 | 1987-11-04 | Crosfield Electronics Ltd | Interactive image modification |
DE3742941A1 (en) * | 1987-12-18 | 1989-07-06 | Standard Elektrik Lorenz Ag | PACKAGE BROKERS |
JP2559262B2 (en) * | 1988-10-13 | 1996-12-04 | 富士写真フイルム株式会社 | Magnetic disk |
JPH02105910A (en) * | 1988-10-14 | 1990-04-18 | Hitachi Ltd | Logic integrated circuit |
AU620994B2 (en) * | 1989-07-12 | 1992-02-27 | Digital Equipment Corporation | Compressed prefix matching database searching |
US5212777A (en) * | 1989-11-17 | 1993-05-18 | Texas Instruments Incorporated | Multi-processor reconfigurable in single instruction multiple data (SIMD) and multiple instruction multiple data (MIMD) modes and method of operation |
US5218709A (en) * | 1989-12-28 | 1993-06-08 | The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration | Special purpose parallel computer architecture for real-time control and simulation in robotic applications |
US5426610A (en) * | 1990-03-01 | 1995-06-20 | Texas Instruments Incorporated | Storage circuitry using sense amplifier with temporary pause for voltage supply isolation |
JPH04219859A (en) | 1990-03-12 | 1992-08-10 | Hewlett Packard Co <Hp> | Harware distributor which distributes series-instruction-stream data to parallel processors |
US5327159A (en) * | 1990-06-27 | 1994-07-05 | Texas Instruments Incorporated | Packed bus selection of multiple pixel depths in palette devices, systems and methods |
US5121198A (en) * | 1990-06-28 | 1992-06-09 | Eastman Kodak Company | Method of setting the contrast of a color video picture in a computer controlled photographic film analyzing system |
US5963746A (en) | 1990-11-13 | 1999-10-05 | International Business Machines Corporation | Fully distributed processing memory element |
US5752067A (en) | 1990-11-13 | 1998-05-12 | International Business Machines Corporation | Fully scalable parallel processing system having asynchronous SIMD processing |
US5590345A (en) | 1990-11-13 | 1996-12-31 | International Business Machines Corporation | Advanced parallel array processor(APAP) |
US5765011A (en) | 1990-11-13 | 1998-06-09 | International Business Machines Corporation | Parallel processing system having a synchronous SIMD processing with processing elements emulating SIMD operation using individual instruction streams |
US5625836A (en) | 1990-11-13 | 1997-04-29 | International Business Machines Corporation | SIMD/MIMD processing memory element (PME) |
US5367643A (en) * | 1991-02-06 | 1994-11-22 | International Business Machines Corporation | Generic high bandwidth adapter having data packet memory configured in three level hierarchy for temporary storage of variable length data packets |
US5285528A (en) * | 1991-02-22 | 1994-02-08 | International Business Machines Corporation | Data structures and algorithms for managing lock states of addressable element ranges |
WO1992015960A1 (en) * | 1991-03-05 | 1992-09-17 | Hajime Seki | Electronic computer system and processor elements used for this system |
US5313582A (en) * | 1991-04-30 | 1994-05-17 | Standard Microsystems Corporation | Method and apparatus for buffering data within stations of a communication network |
US5224100A (en) * | 1991-05-09 | 1993-06-29 | David Sarnoff Research Center, Inc. | Routing technique for a hierarchical interprocessor-communication network between massively-parallel processors |
EP0593609A1 (en) * | 1991-07-01 | 1994-04-27 | Telstra Corporation Limited | High speed switching architecture |
US5404550A (en) * | 1991-07-25 | 1995-04-04 | Tandem Computers Incorporated | Method and apparatus for executing tasks by following a linked list of memory packets |
US5155484A (en) * | 1991-09-13 | 1992-10-13 | Salient Software, Inc. | Fast data compressor with direct lookup table indexing into history buffer |
JP2750968B2 (en) * | 1991-11-18 | 1998-05-18 | シャープ株式会社 | Data driven information processor |
US5307381A (en) * | 1991-12-27 | 1994-04-26 | Intel Corporation | Skew-free clock signal distribution network in a microprocessor |
US5603028A (en) * | 1992-03-02 | 1997-02-11 | Mitsubishi Denki Kabushiki Kaisha | Method and apparatus for data distribution |
JPH0696035A (en) | 1992-09-16 | 1994-04-08 | Sanyo Electric Co Ltd | Processing element and parallel processing computer using the same |
EP0601715A1 (en) * | 1992-12-11 | 1994-06-15 | National Semiconductor Corporation | Bus of CPU core optimized for accessing on-chip memory devices |
US5579223A (en) * | 1992-12-24 | 1996-11-26 | Microsoft Corporation | Method and system for incorporating modifications made to a computer program into a translated version of the computer program |
GB2277235B (en) * | 1993-04-14 | 1998-01-07 | Plessey Telecomm | Apparatus and method for the digital transmission of data |
US5640551A (en) * | 1993-04-14 | 1997-06-17 | Apple Computer, Inc. | Efficient high speed trie search process |
US5420858A (en) * | 1993-05-05 | 1995-05-30 | Synoptics Communications, Inc. | Method and apparatus for communications from a non-ATM communication medium to an ATM communication medium |
JP2629568B2 (en) * | 1993-07-30 | 1997-07-09 | 日本電気株式会社 | ATM cell switching system |
US5918061A (en) * | 1993-12-29 | 1999-06-29 | Intel Corporation | Enhanced power managing unit (PMU) in a multiprocessor chip |
US5524223A (en) | 1994-01-31 | 1996-06-04 | Motorola, Inc. | Instruction accelerator for processing loop instructions with address generator using multiple stored increment values |
US5423003A (en) * | 1994-03-03 | 1995-06-06 | Geonet Limited L.P. | System for managing network computer applications |
DE69428186T2 (en) * | 1994-04-28 | 2002-03-28 | Hewlett Packard Co | Multicast device |
EP0681236B1 (en) * | 1994-05-05 | 2000-11-22 | Conexant Systems, Inc. | Space vector data path |
BR9506208A (en) | 1994-05-06 | 1996-04-23 | Motorola Inc | Communication system and process for routing calls to a terminal |
US5463732A (en) * | 1994-05-13 | 1995-10-31 | David Sarnoff Research Center, Inc. | Method and apparatus for accessing a distributed data buffer |
US5682480A (en) * | 1994-08-15 | 1997-10-28 | Hitachi, Ltd. | Parallel computer system for performing barrier synchronization by transferring the synchronization packet through a path which bypasses the packet buffer in response to an interrupt |
US5949781A (en) * | 1994-08-31 | 1999-09-07 | Brooktree Corporation | Controller for ATM segmentation and reassembly |
US5586119A (en) * | 1994-08-31 | 1996-12-17 | Motorola, Inc. | Method and apparatus for packet alignment in a communication system |
US5754584A (en) * | 1994-09-09 | 1998-05-19 | Omnipoint Corporation | Non-coherent spread-spectrum continuous-phase modulation communication system |
JPH10508714A (en) * | 1994-11-07 | 1998-08-25 | Temple University - Of The Commonwealth System Of Higher Education | Multicomputer system and method
US5651099A (en) * | 1995-01-26 | 1997-07-22 | Hewlett-Packard Company | Use of a genetic algorithm to optimize memory space |
JPH08249306A (en) * | 1995-03-09 | 1996-09-27 | Sharp Corp | Data driven type information processor |
US5634068A (en) * | 1995-03-31 | 1997-05-27 | Sun Microsystems, Inc. | Packet switched cache coherent multiprocessor system |
US5835095A (en) * | 1995-05-08 | 1998-11-10 | Intergraph Corporation | Visible line processor |
JP3515263B2 (en) * | 1995-05-18 | 2004-04-05 | 株式会社東芝 | Router device, data communication network system, node device, data transfer method, and network connection method |
US5689677A (en) | 1995-06-05 | 1997-11-18 | Macmillan; David C. | Circuit for enhancing performance of a computer for personal use |
US6147996A (en) * | 1995-08-04 | 2000-11-14 | Cisco Technology, Inc. | Pipelined multiple issue packet switch |
US6115802A (en) * | 1995-10-13 | 2000-09-05 | Sun Microsystems, Inc. | Efficient hash table for use in multi-threaded environments
US5612956A (en) * | 1995-12-15 | 1997-03-18 | General Instrument Corporation Of Delaware | Reformatting of variable rate data for fixed rate communication |
US5822606A (en) * | 1996-01-11 | 1998-10-13 | Morton; Steven G. | DSP having a plurality of like processors controlled in parallel by an instruction word, and a control processor also controlled by the instruction word |
EP0879544B1 (en) * | 1996-02-06 | 2003-05-02 | International Business Machines Corporation | Parallel on-the-fly processing of fixed length cells |
US5781549A (en) * | 1996-02-23 | 1998-07-14 | Allied Telesyn International Corp. | Method and apparatus for switching data packets in a data network |
US6035193A (en) | 1996-06-28 | 2000-03-07 | At&T Wireless Services Inc. | Telephone system having land-line-supported private base station switchable into cellular network |
US6101176A (en) | 1996-07-24 | 2000-08-08 | Nokia Mobile Phones | Method and apparatus for operating an indoor CDMA telecommunications system |
US6088355A (en) * | 1996-10-11 | 2000-07-11 | C-Cube Microsystems, Inc. | Processing system with pointer-based ATM segmentation and reassembly |
US6791947B2 (en) * | 1996-12-16 | 2004-09-14 | Juniper Networks | In-line packet processing |
JP3000961B2 (en) * | 1997-06-06 | 2000-01-17 | 日本電気株式会社 | Semiconductor integrated circuit |
US5969559A (en) * | 1997-06-09 | 1999-10-19 | Schwartz; David M. | Method and apparatus for using a power grid for clock distribution in semiconductor integrated circuits |
US5828870A (en) * | 1997-06-30 | 1998-10-27 | Adaptec, Inc. | Method and apparatus for controlling clock skew in an integrated circuit |
JP3469046B2 (en) * | 1997-07-08 | 2003-11-25 | 株式会社東芝 | Functional block and semiconductor integrated circuit device |
US6047304A (en) * | 1997-07-29 | 2000-04-04 | Nortel Networks Corporation | Method and apparatus for performing lane arithmetic to perform network processing |
WO1999014893A2 (en) * | 1997-09-17 | 1999-03-25 | Sony Electronics Inc. | Multi-port bridge with triplet architecture and periodical update of address look-up table |
JPH11194850A (en) * | 1997-09-19 | 1999-07-21 | Lsi Logic Corp | Clock distribution network for integrated circuit, and clock distribution method |
US5872993A (en) * | 1997-12-01 | 1999-02-16 | Advanced Micro Devices, Inc. | Communications system with multiple, simultaneous accesses to a memory |
US6081523A (en) * | 1997-12-05 | 2000-06-27 | Advanced Micro Devices, Inc. | Arrangement for transmitting packet data segments from a media access controller across multiple physical links |
US6219796B1 (en) * | 1997-12-23 | 2001-04-17 | Texas Instruments Incorporated | Power reduction for processors by software control of functional units |
US6301603B1 (en) * | 1998-02-17 | 2001-10-09 | Euphonics Incorporated | Scalable audio processing on a heterogeneous processor array |
JP3490286B2 (en) | 1998-03-13 | 2004-01-26 | 株式会社東芝 | Router device and frame transfer method |
JPH11272629A (en) * | 1998-03-19 | 1999-10-08 | Hitachi Ltd | Data processor |
US6052769A (en) * | 1998-03-31 | 2000-04-18 | Intel Corporation | Method and apparatus for moving select non-contiguous bytes of packed data in a single instruction |
US6275508B1 (en) * | 1998-04-21 | 2001-08-14 | Nexabit Networks, Llc | Method of and system for processing datagram headers for high speed computer network interfaces at low clock speeds, utilizing scalable algorithms for performing such network header adaptation (SAPNA) |
WO1999057858A1 (en) * | 1998-05-07 | 1999-11-11 | Cabletron Systems, Inc. | Multiple priority buffering in a computer network |
US6131102A (en) * | 1998-06-15 | 2000-10-10 | Microsoft Corporation | Method and system for cost computation of spelling suggestions and automatic replacement |
US6305001B1 (en) * | 1998-06-18 | 2001-10-16 | Lsi Logic Corporation | Clock distribution network planning and method therefor |
EP0991231B1 (en) | 1998-09-10 | 2009-07-01 | International Business Machines Corporation | Packet switch adapter for variable length packets |
US6393026B1 (en) * | 1998-09-17 | 2002-05-21 | Nortel Networks Limited | Data packet processing system and method for a router |
EP0992895A1 (en) * | 1998-10-06 | 2000-04-12 | Texas Instruments Inc. | Hardware accelerator for data processing systems |
JP3504510B2 (en) * | 1998-10-12 | 2004-03-08 | 日本電信電話株式会社 | Packet switch |
JP3866425B2 (en) * | 1998-11-12 | 2007-01-10 | 株式会社日立コミュニケーションテクノロジー | Packet switch |
US6272522B1 (en) * | 1998-11-17 | 2001-08-07 | Sun Microsystems, Incorporated | Computer data packet switching and load balancing system using a general-purpose multiprocessor architecture |
US6256421B1 (en) * | 1998-12-07 | 2001-07-03 | Xerox Corporation | Method and apparatus for simulating JPEG compression |
JP3704438B2 (en) * | 1998-12-09 | 2005-10-12 | 株式会社日立製作所 | Variable-length packet communication device |
US6338078B1 (en) * | 1998-12-17 | 2002-01-08 | International Business Machines Corporation | System and method for sequencing packets for multiprocessor parallelization in a computer network system |
JP3587076B2 (en) | 1999-03-05 | 2004-11-10 | 松下電器産業株式会社 | Packet receiver |
US6605001B1 (en) * | 1999-04-23 | 2003-08-12 | Elia Rocco Tarantino | Dice game in which categories are filled and scores awarded |
GB2352536A (en) * | 1999-07-21 | 2001-01-31 | Element 14 Ltd | Conditional instruction execution |
GB2352595B (en) * | 1999-07-27 | 2003-10-01 | Sgs Thomson Microelectronics | Data processing device |
USD428484S (en) * | 1999-08-03 | 2000-07-18 | Zirk Todd A | Copper roof vent cover |
US6631422B1 (en) | 1999-08-26 | 2003-10-07 | International Business Machines Corporation | Network adapter utilizing a hashing function for distributing packets to multiple processors for parallel processing |
US6404752B1 (en) * | 1999-08-27 | 2002-06-11 | International Business Machines Corporation | Network switch using network processor and methods |
US6631419B1 (en) * | 1999-09-22 | 2003-10-07 | Juniper Networks, Inc. | Method and apparatus for high-speed longest prefix and masked prefix table search |
US6963572B1 (en) * | 1999-10-22 | 2005-11-08 | Alcatel Canada Inc. | Method and apparatus for segmentation and reassembly of data packets in a communication switch |
AU5075301A (en) * | 1999-10-26 | 2001-07-03 | Arthur D. Little, Inc. | MIMD arrangement of SIMD machines
JP2001177574A (en) * | 1999-12-20 | 2001-06-29 | Kddi Corp | Transmission controller in packet exchange network |
GB2357601B (en) * | 1999-12-23 | 2004-03-31 | Ibm | Remote power control |
US6661794B1 (en) | 1999-12-29 | 2003-12-09 | Intel Corporation | Method and apparatus for gigabit packet assignment for multithreaded packet processing |
ATE280411T1 (en) * | 2000-01-07 | 2004-11-15 | Ibm | METHOD AND SYSTEM FOR FRAMEWORK AND PROTOCOL CLASSIFICATION |
US20030093613A1 (en) * | 2000-01-14 | 2003-05-15 | David Sherman | Compressed ternary mask system and method |
JP2001202345A (en) * | 2000-01-21 | 2001-07-27 | Hitachi Ltd | Parallel processor |
ATE319249T1 (en) * | 2000-01-27 | 2006-03-15 | Ibm | METHOD AND DEVICE FOR CLASSIFICATION OF DATA PACKETS |
US6704794B1 (en) * | 2000-03-03 | 2004-03-09 | Nokia Intelligent Edge Routers Inc. | Cell reassembly for packet based networks |
US20020107903A1 (en) * | 2000-11-07 | 2002-08-08 | Richter Roger K. | Methods and systems for the order serialization of information in a network processing environment |
JP2001251349A (en) * | 2000-03-06 | 2001-09-14 | Fujitsu Ltd | Packet processor |
US7139282B1 (en) * | 2000-03-24 | 2006-11-21 | Juniper Networks, Inc. | Bandwidth division for packet processing |
US7089240B2 (en) * | 2000-04-06 | 2006-08-08 | International Business Machines Corporation | Longest prefix match lookup using hash function |
US7107265B1 (en) * | 2000-04-06 | 2006-09-12 | International Business Machines Corporation | Software management tree implementation for a network processor |
US6718326B2 (en) * | 2000-08-17 | 2004-04-06 | Nippon Telegraph And Telephone Corporation | Packet classification search device and method |
DE10059026A1 (en) | 2000-11-28 | 2002-06-13 | Infineon Technologies Ag | Unit for the distribution and processing of data packets |
GB2370381B (en) * | 2000-12-19 | 2003-12-24 | Picochip Designs Ltd | Processor architecture |
USD453960S1 (en) * | 2001-01-30 | 2002-02-26 | Molded Products Company | Shroud for a fan assembly |
US6832261B1 (en) | 2001-02-04 | 2004-12-14 | Cisco Technology, Inc. | Method and apparatus for distributed resequencing and reassembly of subdivided packets |
GB2407673B (en) | 2001-02-14 | 2005-08-24 | Clearspeed Technology Plc | Lookup engine |
US7856543B2 (en) | 2001-02-14 | 2010-12-21 | Rambus Inc. | Data processing architectures for packet handling wherein batches of data packets of unpredictable size are distributed across processing elements arranged in a SIMD array operable to process different respective packet protocols at once while executing a single common instruction stream |
JP4475835B2 (en) * | 2001-03-05 | 2010-06-09 | 富士通株式会社 | Input line interface device and packet communication device |
USD471971S1 (en) * | 2001-03-20 | 2003-03-18 | Flettner Ventilator Limited | Ventilation cover |
CA97495S (en) * | 2001-03-20 | 2003-05-07 | Flettner Ventilator Ltd | Rotor |
US6687715B2 (en) * | 2001-06-28 | 2004-02-03 | Intel Corporation | Parallel lookups that keep order |
US6922716B2 (en) | 2001-07-13 | 2005-07-26 | Motorola, Inc. | Method and apparatus for vector processing |
US7257590B2 (en) * | 2001-08-29 | 2007-08-14 | Nokia Corporation | Method and system for classifying binary strings |
US7283538B2 (en) * | 2001-10-12 | 2007-10-16 | Vormetric, Inc. | Load balanced scalable network gateway processor architecture |
US7317730B1 (en) * | 2001-10-13 | 2008-01-08 | Greenfield Networks, Inc. | Queueing architecture and load balancing for parallel packet processing in communication networks |
US6941446B2 (en) | 2002-01-21 | 2005-09-06 | Analog Devices, Inc. | Single instruction multiple data array cell |
US7382782B1 (en) | 2002-04-12 | 2008-06-03 | Juniper Networks, Inc. | Packet spraying for load balancing across multiple packet processors |
US20030235194A1 (en) * | 2002-06-04 | 2003-12-25 | Mike Morrison | Network processor with multiple multi-threaded packet-type specific engines |
US7200137B2 (en) * | 2002-07-29 | 2007-04-03 | Freescale Semiconductor, Inc. | On chip network that maximizes interconnect utilization between processing elements |
US8015567B2 (en) | 2002-10-08 | 2011-09-06 | Netlogic Microsystems, Inc. | Advanced processor with mechanism for packet distribution at high line rate |
GB0226249D0 (en) * | 2002-11-11 | 2002-12-18 | Clearspeed Technology Ltd | Traffic handling system |
US7656799B2 (en) | 2003-07-29 | 2010-02-02 | Citrix Systems, Inc. | Flow control system architecture |
US7620050B2 (en) | 2004-09-10 | 2009-11-17 | Canon Kabushiki Kaisha | Communication control device and communication control method |
US7787454B1 (en) | 2007-10-31 | 2010-08-31 | Gigamon Llc. | Creating and/or managing meta-data for data storage devices using a packet switch appliance |
JP5231926B2 (en) | 2008-10-06 | 2013-07-10 | キヤノン株式会社 | Information processing apparatus, control method therefor, and computer program |
US8493979B2 (en) * | 2008-12-30 | 2013-07-23 | Intel Corporation | Single instruction processing of network packets |
US8014295B2 (en) | 2009-07-14 | 2011-09-06 | Ixia | Parallel packet processor with session active checker |
- 2002
- 2002-02-14 US US10/073,948 patent/US7856543B2/en not_active Expired - Fee Related
- 2002-02-14 JP JP2002564715A patent/JP2004524617A/en active Pending
- 2002-02-14 GB GB0321186A patent/GB2390506B/en not_active Expired - Fee Related
- 2002-02-14 US US10/468,167 patent/US20040114609A1/en not_active Abandoned
- 2002-02-14 US US10/074,022 patent/US20020159466A1/en not_active Abandoned
- 2002-02-14 CN CNA02808148XA patent/CN1613041A/en active Pending
- 2002-02-14 AU AU2002233500A patent/AU2002233500A1/en not_active Abandoned
- 2002-02-14 US US10/468,168 patent/US7290162B2/en not_active Expired - Fee Related
- 2002-02-14 US US10/074,019 patent/US20020161926A1/en not_active Abandoned
- 2002-02-14 GB GB0319801A patent/GB2389689B/en not_active Expired - Fee Related
- 2002-02-14 CN CNB028081242A patent/CN100367730C/en not_active Expired - Fee Related
- 2002-02-14 GB GB0203632A patent/GB2377519B/en not_active Expired - Fee Related
- 2002-02-14 GB GB0203634A patent/GB2374443B/en not_active Expired - Fee Related
- 2002-02-14 JP JP2002564890A patent/JP2004525449A/en active Pending
- 2002-02-14 WO PCT/GB2002/000662 patent/WO2002065700A2/en active Application Filing
- 2002-02-14 GB GB0203633A patent/GB2374442B/en not_active Expired - Fee Related
- 2002-02-14 WO PCT/GB2002/000668 patent/WO2002065259A1/en active Application Filing
- 2005
- 2005-06-14 US US11/151,271 patent/US8200686B2/en not_active Expired - Fee Related
- 2005-06-14 US US11/151,292 patent/US20050242976A1/en not_active Abandoned
- 2007
- 2007-05-23 US US11/752,299 patent/US7818541B2/en not_active Expired - Fee Related
- 2007-05-23 US US11/752,300 patent/US7917727B2/en not_active Expired - Fee Related
- 2010
- 2010-12-10 US US12/965,673 patent/US8127112B2/en not_active Expired - Fee Related
Cited By (63)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7055123B1 (en) * | 2001-12-31 | 2006-05-30 | Richard S. Norman | High-performance interconnect arrangement for an array of discrete functional modules |
US20040042496A1 (en) * | 2002-08-30 | 2004-03-04 | Intel Corporation | System including a segmentable, shared bus |
US7360007B2 (en) * | 2002-08-30 | 2008-04-15 | Intel Corporation | System including a segmentable, shared bus |
US20050216625A1 (en) * | 2004-03-09 | 2005-09-29 | Smith Zachary S | Suppressing production of bus transactions by a virtual-bus interface |
US20100241746A1 (en) * | 2005-02-23 | 2010-09-23 | International Business Machines Corporation | Method, Program and System for Efficiently Hashing Packet Keys into a Firewall Connection Table |
US8112547B2 (en) * | 2005-02-23 | 2012-02-07 | International Business Machines Corporation | Efficiently hashing packet keys into a firewall connection table |
US20080276116A1 (en) * | 2005-06-01 | 2008-11-06 | Tobias Bjerregaard | Method and an Apparatus for Providing Timing Signals to a Number of Circuits, an Integrated Circuit and a Node |
US8112654B2 (en) | 2005-06-01 | 2012-02-07 | Teklatech A/S | Method and an apparatus for providing timing signals to a number of circuits, an integrated circuit and a node
US20070017694A1 (en) * | 2005-07-20 | 2007-01-25 | Tomoyuki Kubo | Wiring board and manufacturing method for wiring board |
US8885673B2 (en) | 2005-08-24 | 2014-11-11 | Intel Corporation | Interleaving data packets in a packet-based communication system |
US20070047584A1 (en) * | 2005-08-24 | 2007-03-01 | Spink Aaron T | Interleaving data packets in a packet-based communication system |
US8325768B2 (en) * | 2005-08-24 | 2012-12-04 | Intel Corporation | Interleaving data packets in a packet-based communication system |
US20090089029A1 (en) * | 2007-09-28 | 2009-04-02 | Rockwell Automation Technologies, Inc. | Enhanced execution speed to improve simulation performance |
US20100318339A1 (en) * | 2007-09-28 | 2010-12-16 | Rockwell Automation Technologies, Inc. | Simulation controls for model variability and randomness
US20090089234A1 (en) * | 2007-09-28 | 2009-04-02 | Rockwell Automation Technologies, Inc. | Automated code generation for simulators |
US8548777B2 (en) | 2007-09-28 | 2013-10-01 | Rockwell Automation Technologies, Inc. | Automated recommendations from simulation |
US7801710B2 (en) * | 2007-09-28 | 2010-09-21 | Rockwell Automation Technologies, Inc. | Simulation controls for model variability and randomness |
US20090089227A1 (en) * | 2007-09-28 | 2009-04-02 | Rockwell Automation Technologies, Inc. | Automated recommendations from simulation |
US8417506B2 (en) | 2007-09-28 | 2013-04-09 | Rockwell Automation Technologies, Inc. | Simulation controls for model variability and randomness
US20090089027A1 (en) * | 2007-09-28 | 2009-04-02 | Rockwell Automation Technologies, Inc. | Simulation controls for model variability and randomness
US20090089030A1 (en) * | 2007-09-28 | 2009-04-02 | Rockwell Automation Technologies, Inc. | Distributed simulation and synchronization |
US8069021B2 (en) | 2007-09-28 | 2011-11-29 | Rockwell Automation Technologies, Inc. | Distributed simulation and synchronization |
US20090089031A1 (en) * | 2007-09-28 | 2009-04-02 | Rockwell Automation Technologies, Inc. | Integrated simulation of controllers and devices |
US7995618B1 (en) * | 2007-10-01 | 2011-08-09 | Teklatech A/S | System and a method of transmitting data from a first device to a second device |
US20090268736A1 (en) * | 2008-04-24 | 2009-10-29 | Allison Brian D | Early header CRC in data response packets with variable gap count |
US20090268727A1 (en) * | 2008-04-24 | 2009-10-29 | Allison Brian D | Early header CRC in data response packets with variable gap count |
US20090271532A1 (en) * | 2008-04-24 | 2009-10-29 | Allison Brian D | Early header CRC in data response packets with variable gap count |
US8811430B2 (en) * | 2009-04-29 | 2014-08-19 | Intel Corporation | Packetized interface for coupling agents |
US9736276B2 (en) * | 2009-04-29 | 2017-08-15 | Intel Corporation | Packetized interface for coupling agents |
US20120176909A1 (en) * | 2009-04-29 | 2012-07-12 | Mahesh Wagh | Packetized Interface For Coupling Agents |
US8170062B2 (en) * | 2009-04-29 | 2012-05-01 | Intel Corporation | Packetized interface for coupling agents |
US20140307748A1 (en) * | 2009-04-29 | 2014-10-16 | Mahesh Wagh | Packetized Interface For Coupling Agents |
US20100278195A1 (en) * | 2009-04-29 | 2010-11-04 | Mahesh Wagh | Packetized Interface For Coupling Agents |
US8823495B2 (en) * | 2010-03-12 | 2014-09-02 | Zte Corporation | Sight spot guiding system and implementation method thereof |
US20130038427A1 (en) * | 2010-03-12 | 2013-02-14 | Zte Corporation | Sight Spot Guiding System and Implementation Method Thereof |
US20130229290A1 (en) * | 2012-03-01 | 2013-09-05 | Eaton Corporation | Instrument panel bus interface |
CN104144827A (en) * | 2012-03-01 | 2014-11-12 | 伊顿公司 | Instrument panel bus interface |
US20150012679A1 (en) * | 2013-07-03 | 2015-01-08 | Iii Holdings 2, Llc | Implementing remote transaction functionalities between data processing nodes of a switched interconnect fabric |
US10243882B1 (en) | 2017-04-13 | 2019-03-26 | Xilinx, Inc. | Network on chip switch interconnect |
US10673745B2 (en) | 2018-02-01 | 2020-06-02 | Xilinx, Inc. | End-to-end quality-of-service in a network-on-chip |
US10503690B2 (en) | 2018-02-23 | 2019-12-10 | Xilinx, Inc. | Programmable NOC compatible with multiple interface communication protocol |
US10621129B2 (en) | 2018-03-27 | 2020-04-14 | Xilinx, Inc. | Peripheral interconnect for configurable slave endpoint circuits |
US10505548B1 (en) | 2018-05-25 | 2019-12-10 | Xilinx, Inc. | Multi-chip structure having configurable network-on-chip |
US11263169B2 (en) | 2018-07-20 | 2022-03-01 | Xilinx, Inc. | Configurable network-on-chip for a programmable device |
US10838908B2 (en) | 2018-07-20 | 2020-11-17 | Xilinx, Inc. | Configurable network-on-chip for a programmable device |
US10824505B1 (en) | 2018-08-21 | 2020-11-03 | Xilinx, Inc. | ECC proxy extension and byte organization for multi-master systems |
US10963460B2 (en) | 2018-12-06 | 2021-03-30 | Xilinx, Inc. | Integrated circuits and methods to accelerate data queries |
US20200250281A1 (en) * | 2019-02-05 | 2020-08-06 | Arm Limited | Integrated circuit design and fabrication |
US10796040B2 (en) * | 2019-02-05 | 2020-10-06 | Arm Limited | Integrated circuit design and fabrication |
US10936486B1 (en) | 2019-02-21 | 2021-03-02 | Xilinx, Inc. | Address interleave support in a programmable device |
US10680615B1 (en) | 2019-03-27 | 2020-06-09 | Xilinx, Inc. | Circuit for and method of configuring and partially reconfiguring function blocks of an integrated circuit device |
US20220027308A1 (en) * | 2019-05-09 | 2022-01-27 | SambaNova Systems, Inc. | Control Barrier Network for Reconfigurable Data Processors |
US11580056B2 (en) * | 2019-05-09 | 2023-02-14 | SambaNova Systems, Inc. | Control barrier network for reconfigurable data processors |
US11188312B2 (en) | 2019-05-23 | 2021-11-30 | Xilinx, Inc. | Hardware-software design flow with high-level synthesis for heterogeneous and programmable devices |
US10891414B2 (en) | 2019-05-23 | 2021-01-12 | Xilinx, Inc. | Hardware-software design flow for heterogeneous and programmable devices |
US10891132B2 (en) | 2019-05-23 | 2021-01-12 | Xilinx, Inc. | Flow convergence during hardware-software design for heterogeneous and programmable devices |
US11301295B1 (en) | 2019-05-23 | 2022-04-12 | Xilinx, Inc. | Implementing an application specified as a data flow graph in an array of data processing engines |
US11645053B2 (en) | 2019-05-23 | 2023-05-09 | Xilinx, Inc. | Hardware-software design flow with high-level synthesis for heterogeneous and programmable devices |
US10977018B1 (en) | 2019-12-05 | 2021-04-13 | Xilinx, Inc. | Development environment for heterogeneous devices |
US11496418B1 (en) | 2020-08-25 | 2022-11-08 | Xilinx, Inc. | Packet-based and time-multiplexed network-on-chip |
US11336287B1 (en) | 2021-03-09 | 2022-05-17 | Xilinx, Inc. | Data processing engine array architecture with memory tiles |
US11520717B1 (en) | 2021-03-09 | 2022-12-06 | Xilinx, Inc. | Memory tiles in data processing engine array |
US11848670B2 (en) | 2022-04-15 | 2023-12-19 | Xilinx, Inc. | Multiple partitions in a data processing array |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20040114609A1 (en) | Interconnection system | |
EP3776231B1 (en) | Procedures for implementing source based routing within an interconnect fabric on a system on chip | |
EP3400688B1 (en) | Massively parallel computer, accelerated computing clusters, and two dimensional router and interconnection network for field programmable gate arrays, and applications | |
US8599863B2 (en) | System and method for using a multi-protocol fabric module across a distributed server interconnect fabric | |
US9680770B2 (en) | System and method for using a multi-protocol fabric module across a distributed server interconnect fabric | |
US10608640B1 (en) | On-chip network in programmable integrated circuit | |
US10707875B1 (en) | Reconfigurable programmable integrated circuit with on-chip network | |
US11336287B1 (en) | Data processing engine array architecture with memory tiles | |
RU2283507C2 (en) | Method and device for configurable processor | |
US11730325B2 (en) | Dual mode interconnect | |
US20040100900A1 (en) | Message transfer system | |
US11520717B1 (en) | Memory tiles in data processing engine array | |
US8571016B2 (en) | Connection arrangement | |
US20070245044A1 (en) | System of interconnections for external functional blocks on a chip provided with a single configurable communication protocol | |
Nejad et al. | An FPGA bridge preserving traffic quality of service for on-chip network-based systems | |
US10990552B1 (en) | Streaming interconnect architecture for data processing engine array | |
Bianchini et al. | The Tera project: A hybrid queueing ATM switch architecture for LAN | |
Aust et al. | Real-time processor interconnection network for fpga-based multiprocessor system-on-chip (mpsoc) | |
Guruprasad et al. | An Efficient Bridge Architecture for NoC Based Systems on FPGA Platform | |
Rekha et al. | Analysis and Design of Novel Secured NoC for High Speed Communications | |
REDDY et al. | Design of Reconfigurable NoC Architecture for Low Area and Low Power Applications | |
Sha et al. | Design of Cloud Server Based on Godson Processors | |
Ferrer et al. | Quality of Service in NoC for Reconfigurable Space Applications | |
Khan et al. | Design and implementation of an interface control unit for rapid prototyping | |
Sharma et al. | Fpga cluster based high throughput architecture for cryptography and cryptanalysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment | Owner name: CLEARSPEED TECHNOLOGY LIMITED, UNITED KINGDOM. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SWARBRICK, IAN;WINSER, PAUL;RYAN, STUART;REEL/FRAME:015007/0932;SIGNING DATES FROM 20030911 TO 20040116 |
AS | Assignment | Owner name: CLEARSPEED SOLUTIONS LIMITED, UNITED KINGDOM. Free format text: CHANGE OF NAME;ASSIGNOR:CLEARSPEED TECHNOLOGY LIMITED;REEL/FRAME:015317/0484. Effective date: 20040701 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |