WO2004090693A2

WO2004090693A2 - Methods and systems for determining network integrity and providing improved network availability

Info

Publication number: WO2004090693A2
Application number: PCT/US2004/009970
Authority: WO
Inventors: Jeffrey J. Fitzgerald
Original assignee: Cedar Point Communications, Inc.
Priority date: 2003-04-01
Filing date: 2004-04-01
Publication date: 2004-10-21
Also published as: US7733783B2; US20050018612A1; WO2004090693A3

Abstract

In one embodiment, the invention provides methods and systems for determining status of nodes and related communication channels to configure paths on a network so as to improve network efficiency.

Description

Methods and Systems for Determining Network Integrity and Providing Improved Network Availability

Related Applications

This application claims the benefit of U.S. Prov. Pat. App. No. 60/459548, filed on April 1, 2003, entitled "Method to Improve Ethernet Network Availability", and U.S. Pat. App. 10/727,118, filed on December 2, 2003, entitled "Improving Network Availability", the entire contents of both of which are incorporated herein by reference.

Field of the Invention

The invention relates to computer network communications and, in particular embodiments, to network management protocols.

Background of the Invention

In prior art Wide Area Network (WAN) architecture, communications paths (or links) are point-to-point. In such networks, the nodes (or host computers) communicate with each other directly. Reliable WAN link up/link down status mechanisms are well known at both the physical layer and the data link layer (layers 1 and 2 of the Open Systems Interconnect [OSI] Reference Model). These status mechanisms allow link faults to be determined within about tens of milliseconds to about one or two seconds.

However, in Local Area Network (LAN) architectures, status determination is not as readily available. Several problems arise, in part because LAN protocols, such as Ethernet, are connectionless and support multiple access. By way of example, an Ethernet LAN can be partitioned into multiple subnetworks or segments. A given node (such as, but not limited to, a host computer, load balancing device, or router) on such a LAN is not aware of any such segmentation. If a node faults, there is not necessarily any notification (e.g., a "loss of carrier" signal) to other nodes on its segment or to other segments. Additionally, there is generally no "keep alive" or "link up" check mechanism to determine whether the link or links to a particular node are working or if the node is still "listening" or has left the segment. Accordingly, a convenient way of determining network integrity and readiness for LANs, such as, but not limited to, Ethernet LANs, is needed.

Summary of the Invention

The invention addresses the deficiencies in the prior art by, in one embodiment, providing a method and apparatus for improving local area network (LAN) availability by implementing a standards-based link up/link down status detection protocol on segment-to-segment communications paths. The status detection protocol employs, in one embodiment, industry-standard Logical Link Control (LLC) Type 1 "test frame," described in IEEE Standard 802.2, to provide Ethernet status test messages and return responses. According to a feature, the status detection protocol of the invention provides continuous status information. Such continuous status information enables rapid routing table updates in the LAN (or attached WAN), and thus avoids inefficiencies due to routing to or through disabled or unavailable (down) nodes. In one preferred implementation, the status detection protocol of the invention operates on top of an existing Ethernet protocol in layer 2. According to another feature, the status detection protocol provides multiple access capabilities (a "multiaccess" protocol) and is compatible with Ethernet protocol generally.

In one aspect, a method according to the invention includes the steps of: periodically transmitting a test message over a plurality of communication links from a source node of a source network segment to a plurality of destination nodes, each of the plurality of destination nodes being in communication with a respective destination network segment; generating, for each of the plurality of destination nodes, a return message if the test message is received at the destination node; determining the status of each of the plurality of communication links in response to the return messages generated by the plurality of destination nodes; and providing the status of the plurality of communication links to each of the plurality of destination nodes that generates a return message. According to one configuration, the return messages include echo messages, generated in response to the test messages.

According to one embodiment, the test message includes an LLC Type 1 frame format message. However, any suitable, preferably compact, test message may be employed. In one configuration, the test message is transmitted at a rate of about once per second. Preferably, prior to transmitting the test message, the method of the invention detects an initial state of the network, for example, by observing the routing table at a source or destination node on which the method is operating. The source and/or destination nodes may be, for example, a router, load balancer, firewall, special-purpose device, host computer, or other like device.

According to one feature, the method of the invention operates concurrently on a plurality of the nodes in a network segment to be protected. In a related embodiment, the method of the invention operates concurrently on all or substantially all of the nodes in the network segment. Each node then performs its own self-discovery of adjacency and determines the status of adjacent nodes and links. This information is then used to update an adjacency status table at each node with adjacency information seen from the perspective of that node. In an alternate embodiment, less than all of the nodes in the segment may utilize the method of the invention. However, preferably, more than one node uses the invention.

In one embodiment, status is determined by waiting a pre-determined period of time for a return acknowledgment message, such as an echo of the transmitted test frame. In an alternative embodiment, status is determined by detecting whether the source node receives at least a predetermined number of return messages from a respective destination node, in response to the source node transmitting a predetermined number of test messages. If the expected number of return messages is not received, the system determines that there is a fault with the communication link between the source node and destination node.
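By way of illustration only, the count-based alternative described above may be sketched in Python as follows. The function name link_is_up and its parameter names are assumptions made for this sketch and do not appear in the disclosure; the sketch shows only the comparison of received return messages against a predetermined threshold.

```python
def link_is_up(replies_received: int, tests_sent: int, min_replies: int) -> bool:
    """Return True when at least min_replies of the tests_sent test messages
    drew a return message from the destination node.

    All names here are hypothetical; the thresholds are supplied by the caller.
    """
    if not 0 <= replies_received <= tests_sent:
        raise ValueError("replies_received must be between 0 and tests_sent")
    if not 0 < min_replies <= tests_sent:
        raise ValueError("min_replies must be between 1 and tests_sent")
    # Fewer replies than the threshold is treated as a fault on the link.
    return replies_received >= min_replies


# Example: three test messages sent, at least two return messages required.
print(link_is_up(replies_received=3, tests_sent=3, min_replies=2))
print(link_is_up(replies_received=1, tests_sent=3, min_replies=2))
```

Requiring fewer replies than tests sent tolerates an occasional lost frame, at the cost of a somewhat slower fault declaration.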

If the status of any node changes, as denoted by the failure to receive a return message from that node signifying either a node or a link failure, the sending node updates its local adjacency status table. The status changes may then be incorporated into the local RIB/routing table, which is then propagated to other routers on the network through standard mechanisms well-known in the art.

Because each router updates its adjacency status table each time the local message/response cycle is completed, reflecting the true state of all links, LAN efficiency is improved by avoiding routes through dead links or to unresponsive nodes. For example, a response wait period of approximately one second allows router table updates approximately every few seconds, instead of the 5 to 10 minutes seen in the prior art.

One or more of the nodes performing the above status discovery process may be, in some embodiments, one of the hosts on the network, or a dedicated device configured to act as a router (as that term and function is known in the art) with the added functionality necessary to implement the presently-disclosed methods. Alternately, one or more of the status-discovering nodes may be a specially-adapted hardware and/or software device dedicated to this function.

In an alternate embodiment, the local node may update its copy of the network routing table directly upon determining that a node on the network (or network segment) has not responded to the test message. The modified routing table may then be advertised and propagated to all other routers on the network.

Brief Description of the Drawings

The methods and systems of the invention may be better understood by referencing the following drawings, where like reference designations refer to like components or steps.

Figure 1 is a high-level block diagram of a Local Area Network (LAN) configured in accordance with one embodiment of the invention.

Figure 2 is a flowchart of a method for determining status and increasing LAN availability, according to one embodiment of the invention.

Description of the Illustrative Embodiments

Figure 1 is a block diagram of an illustrative LAN 100. The LAN 100 includes two network segments 112 and 114. The network segments 112 and 114 may be, for example, Ethernet networks, although the invention is equally applicable to other network protocols, and is not limited to a particular network protocol. The network segment 112 includes a plurality of communication links 120a-120d between nodes 125a-125c. Similarly, the network segment 114 includes a plurality of communication links 122a-122e for communicatively interconnecting the nodes 124a-124e. The nodes 125a-125e and 124a-124e may be, for example, host computers, routers, load balancers, firewalls, or any other suitable network device. In the particular illustrative embodiment of Figure 1, the devices 125d, 124a and 124c are routers. The routers 125d and 124c can communicate with each other via the channels 130 and 132, thus enabling communications between the network segments 112 and 114.

The router 125d, in one exemplary embodiment, may be configured to act as one of the status-discovering nodes for the segment 114. As such, the router 125d sends messages to all external (to segment 114) nodes 124a-124e, one node at a time, to see if the communication channels to them (e.g., channels 130, 132, and 122a-122e) are operational. These messages may be LLC type 1 test frames, although any short test messages with a regular and predefined format may be used. The Logical Link Control (LLC) layer is the higher of the two data link layer sub-layers defined by the IEEE in its Ethernet standards. The LLC sub-layer handles error control, flow control, framing, and MAC-sub-layer addressing. The most prevalent LLC protocol is IEEE Standard 802.2, which includes both connectionless and connection-oriented variants.
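For illustration, an LLC Type 1 TEST command carried in an IEEE 802.3 frame may be assembled as in the following Python sketch. The control field value 0xE3 is the IEEE 802.2 TEST code with the poll bit clear. The use of the null SAP (0x00) for both DSAP and SSAP, the function name build_llc_test_frame, and the example addresses are assumptions of this sketch rather than requirements of the disclosure. The frame check sequence is omitted because it is normally appended by the network interface.

```python
import struct

LLC_TEST_COMMAND = 0xE3  # IEEE 802.2 TEST control field, poll bit clear
NULL_SAP = 0x00          # assumption: null SAP for both DSAP and SSAP
MIN_PAYLOAD = 46         # minimum 802.3 payload in bytes (FCS not included)
MAX_PAYLOAD = 1500       # maximum 802.3 payload in bytes


def build_llc_test_frame(dst_mac: bytes, src_mac: bytes, data: bytes = b"") -> bytes:
    """Build an 802.3 frame (without FCS) carrying an LLC Type 1 TEST command."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be exactly 6 bytes")
    # LLC header: DSAP, SSAP, control, followed by the optional test data.
    llc_pdu = struct.pack("!BBB", NULL_SAP, NULL_SAP, LLC_TEST_COMMAND) + data
    if len(llc_pdu) > MAX_PAYLOAD:
        raise ValueError("test data too long for one frame")
    # The 802.3 length field counts the LLC PDU only, not the padding.
    header = dst_mac + src_mac + struct.pack("!H", len(llc_pdu))
    padding = b"\x00" * max(0, MIN_PAYLOAD - len(llc_pdu))
    return header + llc_pdu + padding


# Example with made-up addresses.
frame = build_llc_test_frame(bytes.fromhex("0200000000b1"),
                             bytes.fromhex("0200000000a1"),
                             b"ping")
print(len(frame), frame[12:17].hex())
```

A station that answers such a command returns a TEST response carrying the same information field, which is what allows the return message to serve as an echo.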

To reduce intra-segment traffic, test frames may not be sent to locally attached nodes (e.g., nodes 125a-125c). For example, in one embodiment, only nodes outside of the network segment 112 (referred to herein as "destination" nodes) may be sent messages.

Return messages are generated by the destination nodes and sent back to the source node (i.e., the status-discovering node) for collection and matching to transmitted test messages. The return message may be a simple echo of the test message or a different, confirming message may be sent. Either way, the presence of a return message acknowledging (in some sense) the transmitted message provides a complete, end-to-end test of path continuity and therefore its status.
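The collection and matching of return messages to transmitted test messages may be sketched, for illustration only, as follows. The class PendingTests, its method names, and the choice of keying each outstanding test by destination and payload are assumptions of this sketch; the disclosure does not prescribe a particular matching structure.

```python
from typing import Dict, Optional, Tuple


class PendingTests:
    """Tracks test messages awaiting a return message (hypothetical helper)."""

    def __init__(self) -> None:
        # Maps (destination, payload) to the time the test message was sent.
        self._pending: Dict[Tuple[str, bytes], float] = {}

    def record_sent(self, dest: str, payload: bytes, sent_at: float) -> None:
        self._pending[(dest, payload)] = sent_at

    def match_return(self, source: str, payload: bytes,
                     received_at: float) -> Optional[float]:
        """Return the round-trip time if the return message echoes an
        outstanding test message from this node, otherwise None."""
        sent_at = self._pending.pop((source, payload), None)
        if sent_at is None:
            return None  # unexpected, duplicate, or mismatched return message
        return received_at - sent_at


pending = PendingTests()
pending.record_sent("124a", b"test-1", sent_at=10.0)
print(pending.match_return("124a", b"test-1", received_at=10.25))
print(pending.match_return("124a", b"test-1", received_at=10.5))
```

Removing an entry once it is matched means that a duplicated or late echo is not counted twice.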

One advantage of using the LLC Type 1 test message is that it is purely a Layer 2 approach that does not propagate any overhead to Layer 3 or above in the protocol stack. Accordingly, the low overhead on the source and destination nodes makes for low round-trip delay and hence improved link fault detection timeliness. Note that this statusing approach differs from the link integrity test used to determine the health of a link as far back as 10Base-T Ethernet. As described in the Cisco Press Internetworking Technology Handbook, Chapter 2 (online at

http://www.cisco.com/univercd/cc/td/doc/cisintwk/ito_doc/index.htm

accessed September 20, 2002): 10Base-T was also the first Ethernet version to include a link integrity test to determine the health of the link. Immediately after power-up, the physical medium attachment (PMA) sublayer transmits a normal link pulse (NLP) to tell the NIC at the other end of the link that this NIC wants to establish an active link connection:

If the NIC at the other end of the link is also powered up, it responds with its own NLP.

If the NIC at the other end of the link is not powered up, this NIC continues sending an NLP about once every 16 ms until it receives a response.

The link is activated only after both NICs are capable of exchanging valid NLPs.

It is applicant's understanding that the 10Base-T integrity check is only used at initial power-up, to establish the link between the Network Interface Cards (NICs) in two hosts. The statusing mechanism herein described, by contrast, operates continuously to keep track of segment host status. In some exemplary embodiments, the status test message is sent approximately once per second to keep status information current.

Figure 2 is a flowchart 200 depicting a process for improving network availability according to one illustrative embodiment of the invention. The process begins on power-up of a status-detecting node, 210. Initially, each status-detecting node performs a discovery step 215 to identify its nearest (adjacent) network neighbors outside of the status-detecting node's own network segment and the status of those network neighbors, using any conventional mechanism. Alternatively, a status-detecting node may refer to initial status and adjacency information supplied to it in a local configuration file.

Next, the status-detecting node begins sending test messages 220 to each nearest neighbor not within the status-detecting node's segment (e.g., where the status checking node is 125d, not within the network segment 112). After each message, the status-detecting node waits a pre-determined time (on the order of about 500 milliseconds) for a response, 230. Test 240 is a binary test on the reply received: if the reply matches the expected message (branch 242), then the channel or path is up and working. The status of that connection is then marked as "up" 244 in the local adjacency status table.

In some embodiments, the local adjacency status table is a separate table in the local routing information base (RIB); it may also be separate and distinct from the RIB. However, according to the illustrative embodiment, the adjacency status table is not a part of the local routing table when that term is used as implying a distinction from the RIB.

If the return message is not as expected or does not arrive at all within the predetermined wait time, branch 246 is taken and the link path status is marked as "down" in step 248.

In a preferred embodiment, the pre-determined wait time is specified in a configuration table (or file) supplied to the status discovery process or coded into software as a default value of, for example, one second. This link-specific wait time may be adjusted (not shown) according to the (known) speed of each link and the actual round-trip time (RTT) through mechanisms well-known to those of ordinary skill in the art. Thus, for distant (long) links operating at slow speeds, the discovery process will increase the link-specific wait time during the initial discovery. In particular, the method does not mark a link as "down" until it first verifies the RTT wait time by finding (and marking) the link as "up," as depicted by the secondary test 270.
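The interaction between the link-specific wait time and the secondary test 270 may be sketched, by way of example only, as follows. The class LinkTimer, the doubling of the wait on an unverified timeout, the safety margin of twice the measured RTT, and the bounds on the wait are all assumptions of this sketch; the disclosure leaves the adjustment mechanism to those of ordinary skill in the art.

```python
from dataclasses import dataclass


@dataclass
class LinkTimer:
    """Per-link wait time with an 'up at least once' guard (hypothetical)."""
    wait_s: float = 1.0      # default wait, per the example value in the text
    min_wait_s: float = 0.5  # assumption: lower bound on the wait
    max_wait_s: float = 8.0  # assumption: upper bound on the wait
    margin: float = 2.0      # assumption: wait is set to margin * measured RTT
    seen_up: bool = False    # set once a return message has been received

    def on_reply(self, rtt_s: float) -> str:
        """A return message arrived: the wait time is now verified."""
        self.seen_up = True
        self.wait_s = min(self.max_wait_s, max(self.min_wait_s, self.margin * rtt_s))
        return "up"

    def on_timeout(self) -> str:
        """No return message within wait_s."""
        if not self.seen_up:
            # Secondary test 270: the wait may simply be too short for this
            # link, so lengthen it instead of marking the link down.
            self.wait_s = min(self.max_wait_s, self.wait_s * 2)
            return "unverified"
        return "down"


timer = LinkTimer()
print(timer.on_timeout(), timer.wait_s)
print(timer.on_reply(rtt_s=1.5), timer.wait_s)
print(timer.on_timeout(), timer.wait_s)
```

In this sketch a link that has never answered remains "unverified" rather than "down", which mirrors the requirement that the wait time first be verified by finding the link up.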

In marking the link down in the adjacency status table, there may be several degrees of "down" indicated. The link may be down because it is overly congested, i.e., when no replies are received in the wait period for several tries. Alternately, the link may be marked down because the destination node is itself down or congested. Furthermore, the link may be down because the network or a segment thereof is down, as signaled through, for example, a routine routing table update. This information may be included by using different symbols for the different states or by encoding the information using two or more bits through methods well-known in the art.
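One possible two-bit encoding of the degrees of "down" is sketched here for illustration. The state names, the bit assignments, and the packing of several links into one integer are assumptions of this sketch and are not taken from the disclosure.

```python
from enum import IntEnum
from typing import List, Sequence


class LinkState(IntEnum):
    """Two-bit link status codes (hypothetical assignment)."""
    UP = 0b00
    DOWN_LINK_CONGESTED = 0b01       # no replies within the wait for several tries
    DOWN_NODE_UNAVAILABLE = 0b10     # destination node itself down or congested
    DOWN_SEGMENT_UNAVAILABLE = 0b11  # network or segment reported down


def pack_states(states: Sequence[LinkState]) -> int:
    """Pack one two-bit code per link into a single integer, link 0 lowest."""
    packed = 0
    for index, state in enumerate(states):
        packed |= int(state) << (2 * index)
    return packed


def unpack_states(packed: int, count: int) -> List[LinkState]:
    """Recover the per-link codes written by pack_states."""
    return [LinkState((packed >> (2 * index)) & 0b11) for index in range(count)]


states = [LinkState.UP, LinkState.DOWN_NODE_UNAVAILABLE, LinkState.DOWN_SEGMENT_UNAVAILABLE]
packed = pack_states(states)
print(packed, unpack_states(packed, len(states)) == states)
```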

The updated path status from either step 244 or 248 is then used to update the local node's adjacency status table 250, which in turn forces a Routing Information Base (RIB) update, 255. The process waits approximately one second, 260, before sending a test message to the next host in step 220, repeating the cycle indefinitely or until commanded to cease or power-down. (As noted above, in some embodiments, the wait time is dynamically adjusted to reflect the actual RTT to each node).
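One pass of the cycle of steps 220 through 260 may be sketched as follows, for illustration only. The function run_status_cycle and its callable parameters (probe, update_rib, sleep) are hypothetical stand-ins for the transmit-and-wait logic, the RIB update mechanism, and the recycle delay, none of which are specified at this level of detail in the disclosure.

```python
from typing import Callable, Dict, Iterable


def run_status_cycle(
    neighbors: Iterable[str],
    probe: Callable[[str], bool],
    table: Dict[str, str],
    update_rib: Callable[[Dict[str, str]], None],
    sleep: Callable[[float], None],
    recycle_wait_s: float = 1.0,
) -> None:
    """Test each neighbor once, one node at a time.

    probe(dest) is assumed to send a test message and return True only if
    the expected return message arrives within the wait time.
    """
    for dest in neighbors:
        status = "up" if probe(dest) else "down"  # steps 220 to 248
        table[dest] = status                      # step 250: adjacency status table
        update_rib(dict(table))                   # step 255: hand a snapshot to the RIB
        sleep(recycle_wait_s)                     # step 260: wait before the next host


# Example with stand-in callables; a real caller would repeat the pass indefinitely.
adjacency: Dict[str, str] = {}
run_status_cycle(["124a", "124c"], lambda dest: dest == "124a",
                 adjacency, lambda snapshot: None, lambda seconds: None)
print(adjacency)
```

Passing the probe and delay functions in as parameters keeps the sketch independent of any particular network interface and makes it straightforward to exercise without real traffic.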

The wait durations described above are examples only. Longer or shorter wait times 230 (before declaring a lack of response message as a link "down" indicator) and 260 (recycle time between messages) are also useable. The length of wait determines the degree to which message traffic overhead (from the test messages and their responses) impacts the overall network's performance. Longer waits (especially at recycle step 260) decrease message overhead, but at the cost of additional latency before status updates reach the routing table and can be propagated through the network.
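The trade-off between wait times and status latency can be illustrated with a short calculation. The sketch assumes that neighbors are tested strictly one at a time, that a reply arrives almost immediately in the best case, and that every test times out in the worst case; the function name sweep_time_bounds is hypothetical.

```python
from typing import Tuple


def sweep_time_bounds(neighbors: int, response_wait_s: float,
                      recycle_wait_s: float) -> Tuple[float, float]:
    """Return (best, worst) time in seconds to test every neighbor once.

    Best case: each reply is effectively immediate, so only the recycle
    wait 260 is spent per neighbor. Worst case: each test also consumes
    the full response wait 230.
    """
    best = neighbors * recycle_wait_s
    worst = neighbors * (response_wait_s + recycle_wait_s)
    return best, worst


# Five neighbors, 500 ms response wait, one second recycle wait.
print(sweep_time_bounds(5, 0.5, 1.0))
```

Under these assumptions, five neighbors with the example values take between 5 and 7.5 seconds per full sweep, and doubling the recycle wait halves the test traffic while doubling the best-case sweep time.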

The illustrative method may be practiced by a single node, by a plurality of nodes, or by some or all nodes in a segment or network. When multiple nodes each act as independent status discoverers, very rapid RIB/routing table updates will result as nodes, links, or paths come up or go down. In such a scenario, link state information may be updated on the order of once every five or ten seconds, a significant improvement over prior methods of monitoring link status.

While particular embodiments of the invention have been shown and described, changes and modifications may be made without departing from the scope of the invention. By way of example, the illustrative steps of the invention are described above in a particular order. However, they may be performed in other orders within the scope of the invention. Additionally, the methodology of the invention may be performed in hardware, software or any combination thereof. Additionally, the methods and systems of the invention may be embodied in software, firmware, and/or microcode operating on a computer or computers of any type.

Claims

1. A method for identifying status of a communication channel in a local area network (LAN) comprising,

periodically transmitting a test message over a plurality of communication links from a source node of a source network segment to a plurality of destination nodes of at least one destination network segment outside of the source network segment,

monitoring at the source node for return messages sent, in response to receiving the test message, from the plurality of destination nodes, and

logging at the host node communication channels associated with each of the plurality of destination nodes, for which the source node receives the return message, as being up and running.

2. The method of claim 1 comprising logging, at the host node, as down and not running, communication channels associated with each of the plurality of destination nodes, for which the source node does not receive the return message.

3. The method of claim 1 comprising determining whether a particular one of the destination nodes is up and running by waiting a preset wait time for the return message.

4. The method of claim 3, wherein the wait time is operator-adjustable.

5. The method of claim 1 comprising determining at the source node an expected roundtrip time from the source node to each of the plurality of destination nodes to determine how long to wait for the return message from any particular one of the destination nodes.

6. The method of claim 5 comprising automatically adjusting at the source node a wait time for the return message from a particular one of the destination nodes, based at least in part on the expected round trip time for the particular destination node.

7. The method of claim 1 comprising transmitting the test message from the source node to a subset of the plurality of destination nodes substantially concurrently.

8. The method of claim 1 comprising transmitting the test message from the source node to a subset of the plurality of destination nodes one at a time.

9. The method of claim 1 comprising, performing a discovery step at the source node to discover adjacent neighboring network segments outside of the source network segment prior to transmitting the test message.

10. The method of claim 1, wherein the test message includes a LLC type 1 frame format message.

11. The method of claim 1 comprising detecting an initial state of the network by observing a routing table at at least one of the source node and one of the plurality of destination nodes.

12. The method of claim 1, wherein the return message includes an echo of the test message.

13. The method of claim 1 comprising sending the test message a plurality of times to each of the destination nodes and marking a particular one of the destination nodes as being up and running in response to receiving at the host a preset number of return messages.

14. The method of claim 1 comprising updating a local adjacency status table to reflect changes of status in any of the plurality of destination nodes.

15. The method of claim 1 comprising incorporating the changes of status into a local RIB/routing table for propagation to other routers on the network.

16. The method of claim 1 wherein the source node is a router.

17. The method of claim 1 wherein the plurality of destination nodes are routers.

18. A method for determining status of communication channels in a local area network, comprising, periodically transmitting one or more test messages over a plurality of communication channels from a plurality of source nodes of a source network to a plurality of destination nodes,

monitoring at the source nodes for return messages sent, in response to receiving the test message, from any of the plurality of destination nodes, and

logging, at at least one of the plurality of host nodes, communication channels associated with each of the plurality of destination nodes for which the at least one of the source nodes receives the return message, as being up and running.

19. A system for improving availability comprising: a plurality of destination nodes in communication with a respective one of a plurality of destination network segments, each of the destination nodes configured to receive a test message through one of a plurality of communication links and generate a return message; a source node in communication with each of the plurality of destination nodes, the source node configured to provide a test message to each of the plurality of destination nodes, and for determining the status of each of the plurality of communications links in response to the return messages; and a configuration update module in communication with the source node and the plurality of destination nodes, the configuration update module providing a status message to each of the destination nodes that provides a return message to the source node.

20. The system of claim 19, wherein the source node transmits the test message approximately once per second.

21. The system of claim 19, wherein the source node is a router.

22. The system of claim 19, wherein a subset of the plurality of destination nodes are routers.

23. The system of claim 19, wherein the test message is an LLC type 1 frame format.

24. The system of claim 19, wherein the return message is an echo message of the test message.

25. A method of improving network availability comprising,

in a first network segment having a plurality of nodes and first links therebetween, identifying one or more cooperating nodes on a second network segment, wherein the identifying includes periodically transmitting a test message over one or more paths from a source node in the first network segment to a destination node in the second network segment, wherein the second network segment includes a plurality of nodes and second links therebetween, and

in response to a return message received from the destination node, determining the status of the one or more paths; and providing the status to each of the plurality of destination nodes that generated a return message.

26. The method of claim 25 further comprising, configuring of the paths between the source node and the second network in response to determining the status.

27. A system for adjacency protection comprising,

a first network segment having a plurality of nodes and first links therebetween,

a second network segment having a plurality of nodes and second links therebetween, wherein the first and second network segments are connected, and

a mechanism for identifying one or more cooperating nodes on the second network segment, by periodically transmitting a test message over one or more paths from a source node in the first network segment to a destination node in the second network segment, in response to a return message received from the destination node, determining the status of the one or more paths, and providing the status to each of the plurality of destination nodes that generated a return message.

28. The system of claim 27 comprising elements for configuring one or more of the paths between the source node and the second network segment in response to the status of the one or more paths.