United States Patent [19]
Cypher et al.
 METHOD OF PACKET ROUTING IN TORUS NETWORKS WITH TWO BUFFERS PER EDGE
 Inventors: Robert E. Cypher, Los Gatos; Luis Gravano, Mountain View, both of Calif.
 Assignee: International Business Machines Corporation, Armonk, N.Y.
 Appl. No.: 969,650
 Filed: Oct. 29, 1992
 Int. Cl.6 H04L 12/42
 U.S. Cl. 370/60; 370/85.12;
 Field of Search 370/13, 13.1, 17, 54,
370/60, 60.1, 61, 79, 85.9, 85.12, 85.15, 94.1,
 References Cited
U.S. PATENT DOCUMENTS
4,742,511 5/1988 Johnson 370/94
4,805,091 2/1989 Thiel et al 364/200
4,814,980 3/1989 Peterson et al 364/200
4,933,933 6/1990 Dally et al 370/60
4,984,235 1/1991 Hillis et al 370/60
5,008,815 4/1991 Hillis 364/200
5,105,424 4/1992 Flaig et al 370/94.1
5,151,900 9/1992 Snyder et al 370/94.3
5,157,692 10/1992 Horie et al 370/60 X
FOREIGN PATENT DOCUMENTS
460599 12/1991 European Pat. Off.
2227341 7/1990 United Kingdom
OTHER PUBLICATIONS
Y. Tamir and G. L. Frazier, "Dynamically-Allocated Multi-Queue Buffers for VLSI Communication Switches", IEEE Transactions on Computers, vol. 41, No. 6, pp. 725-737, Jun. 1992.
R. Cypher and L. Gravano, "Adaptive, Deadlock-Free Packet Routing in Torus Networks with Minimal Storage", IBM RJ 8571 (77350), Jan. 15, 1992.
Dally, William J., "Deadlock-Free Message Routing in Multiprocessor Interconnection Networks", IEEE Transactions on Computers, vol. C-36, No. 5, May 1987.
[11] Patent Number: 5,444,701  Date of Patent: Aug. 22, 1995
Dally, William J., "The torus routing chip", Distributed Computing (1986) 1:187-196.
Hayes, John P., "Computer Architecture and Organization", Second Edition (1988), pp. 645-664.
Hwang, Kai, "Computer Architecture and Parallel Processing", (1984), Chapter Five, pp. 325-392.
Konstantinidou, S., "Chaos Router: A Practical Application of Randomization in Network Routing", 2nd Annual ACM Symposium on Parallel Algorithms and Architectures, Jul. 2-6, 1990, Island of Crete, Greece.
Tamir, Yuval, "Dynamically-Allocated Multi-Queue Buffers for VLSI Communication Switches", IEEE Transactions on Computers, vol. 41, No. 6, Jun. 1992.
Tanenbaum, Andrew S., "Computer Networks", Second Edition (1988), Chapter 1, pp. 6-21.
Van De Goor, A. J., "Computer Architecture and Design", (1989), Chapter 16, pp. 473-506.
Primary Examiner—Melvin Marcelo
Attorney, Agent, or Firm—Philip E. Blair; James C.
A method is disclosed for routing packets in parallel computers with torus interconnection networks of arbitrary size and dimension having a plurality of nodes, each of which contains at least 2 buffers per edge incident to the node. For each packet which is being routed or which is being injected into the communication network, a waiting set is specified which consists of those buffers to which the packet can be transferred. The packet can be transferred to any buffer in its waiting set which has enough storage available to hold the packet. This waiting set is specified by first defining a set of nodes to which the packet is allowed to move and then defining a candidate set of buffers within the defined set of nodes. An ordering of the nodes across the network, from smallest to largest, is then defined. The buffers in each node are classified into four classes. After the buffers in each node have been classified, a set of rules is provided for placing into the waiting set those classes of candidate buffers to which the packet can move, such that the routing method is free of deadlock, livelock, and starvation.
32 Claims, 2 Drawing Sheets
U.S. Patent Aug. 22, 1995 Sheets 1 and 2 of 2 5,444,701
METHOD OF PACKET ROUTING IN TORUS NETWORKS WITH TWO BUFFERS PER EDGE
FIELD OF THE INVENTION
Broadly conceived, the present invention relates to packet routing on parallel computers with torus interconnection networks of arbitrary size and dimension and, in particular, to methods for defining sets of buffers to which packets are allowed to move.
BACKGROUND HISTORY
A typical computer has a central processing unit (CPU) which controls the processing of the computer. A computer system may contain more than one processor or node. Such a computer system would be able to process much more data in a faster timeframe in a parallel fashion than would a computer system having a single processor.
A computer having multiple processors configured in a grid with all the processors working simultaneously in parallel is known as a mesh configuration. In a mesh configuration, the processors are connected to their neighboring processors in a mesh. Each node or processor would have 4 edges, or fewer if the processor is on a boundary of the mesh, with each edge connected to the next neighbor processor. If the edges of the mesh were wrapped around such that the processors on the boundary of the mesh were connected to the processors on the opposite boundary of the mesh, a toroidal configuration would result. This is known as a torus network.
Parallel computers with mesh and torus interconnection networks are known in the art because they are able to support many scientific and image processing applications very efficiently and have advantages in terms of ease of construction. A d-dimensional mesh or torus can be implemented with short wires in d dimensions. In addition, mesh and torus networks can be constructed using identical boards, each of which requires only a small number of pins for connections to other boards containing processor units. Because of this modularity, a large number of distributed memory parallel computers utilize a mesh or torus interconnection network.
In terms of the differences between torus and mesh configured networks, for given d-dimensional mesh and torus computers of equal size, the torus computer has approximately half the diameter and twice the bisection bandwidth of the mesh computer. Furthermore, torus networks are node symmetric, i.e., all nodes in the torus are identical and therefore no region of the torus is particularly likely to suffer from congestion, which is the condition when the interconnection network becomes clogged with messages and begins to slow itself down. In contrast, mesh networks are not node symmetric and their lack of symmetry can cause certain regions of the mesh to suffer congestion. As a result, torus interconnection networks are expected to play an increasingly important role in future generations of parallel computers.
The processors in a parallel computer communicate with one another by sending or routing packets of data across the network to the other processors. These packets are sent through the interconnection network from their source processors (nodes) to their destination nodes by a packet routing algorithm. A fundamental requirement of any packet routing algorithm is that it must at the very least guarantee that all messages will eventually be delivered to their destinations. In order for the packet routing algorithm to satisfy this basic requirement, it must keep the interconnection network free from conditions known as deadlock, livelock, and starvation.
Deadlock is the condition of the interconnection network in which a set of buffers is completely occupied by messages all of which are only allowed to move to other buffers within the set. As a result, none of the messages in this set of buffers can make progress and none of them will ever be delivered. Livelock is the condition of the interconnection network in which a packet moves between buffers an unbounded number of times without being delivered to its destination processor. Thus, a routing algorithm which is subject to livelock may never deliver a packet to its destination processor even though the packet continues to move throughout the network amongst various nodes. Starvation is the condition of the interconnection network in which a packet waits for a buffer which becomes available an unbounded number of times without ever being granted access to that buffer. Thus a routing algorithm which is subject to starvation may fail to move a packet at all even though a buffer is available into which that packet could be moved.
A packet routing algorithm should also exhibit good performance characteristics. In order to provide good performance, a routing algorithm should avoid sending packets along unnecessarily long routes. A routing algorithm is said to be minimal if the routing algorithm sends each packet along the shortest possible route.
A packet routing algorithm should also be able to adapt to network congestion conditions. A packet routing algorithm is said to be adaptive if it allows packets to adapt to the various traffic conditions in the interconnection network and to select an alternative path based on the congestion any given packet encounters en route. By allowing packets to take alternate routes which avoid congestion, adaptive routing algorithms can greatly improve network communication performance. An adaptive, minimal routing algorithm that allows every packet to take all of its shortest routes to its destination node is said to be fully adaptive.
Packet routing algorithms can be further classified by the type of switching mode or routing that they utilize. In store-and-forward routing, each packet is stored completely in a node before being sent to the next node along the path. In general, store-and-forward routing is a simple technique which works well when the packets are small in comparison with the channel widths. In contrast, wormhole routing breaks each packet into small pieces called flits. As soon as a flit has been received by a node, the flit is sent to the next node in its path without waiting for the remaining flits of the packet to arrive. This creates a worm of flits which follow one another from node to node through the network towards their destination node. If the head of this worm of flits encounters congestion, the entire worm is prevented from making progress. Another switching mode, which is similar to wormhole routing, is known as virtual cut-through routing. In virtual cut-through routing, each packet is sent as a worm of flits which follow one another through the network, with each node buffering the entire worm inside the node whenever congestion occurs on the interconnection network in order to reduce traffic. This requires the use of internal buffers in each node which are set aside for buffering packets that have encountered congestion.
The processors in a parallel computer communicate 60 switching mode which is similar to wormhole routing, with one another by sending or routing packets of data is known as virtual cut-through routing. In virtual cutacross the network to the other processors. These pack- through routing, each packet is sent as a worm of flits ets are sent through the interconnection network from which follow one another through the network with their source processors (nodes) to their destination each node buffering the entire worm inside the node nodes by a packet routing algorithm. A fundamental 65 whenever congestion occurs on the interconnection requirement of any packet routing algorithm is that it network in order to reduce traffic. This requires the use must at the very least guarantee that all messages will of internal buffers in each node which are set aside for eventually be delivered to their destinations. In order buffering packets that have encountered congestion.