|Publication number||US7394836 B2|
|Application number||US 10/704,354|
|Publication date||Jul 1, 2008|
|Filing date||Nov 6, 2003|
|Priority date||Dec 13, 2002|
|Also published as||US20040114602|
|Inventors||Nam Seok Ko, Dong Yong Kwak|
|Original Assignee||Electronics And Telecommunications Research Institute|
This application claims priority from Korean Patent Application No. 2002-79730, filed on Dec. 13, 2002, the disclosure of which is incorporated herein by reference in its entirety.
1. Field of the Invention
The present invention relates to a packet scheduling system and method, and more particularly, to a packet scheduling system and method capable of fair link resource distribution among a plurality of sessions requesting transmissions to an identical output link in input and output interfaces of a node, such as an asynchronous transfer mode (ATM) switch or a router, in a high-speed packet exchange network such as an ATM network or the Internet.
2. Description of the Related Art
Generally, packet scheduling algorithms are divided into two classes according to the operation mode of the server. In a work-conserving method, the server provides service continuously as long as a packet is waiting in a queue. In a non-work-conserving method, the server withholds service and remains in a waiting state when predetermined conditions are not satisfied, even though a packet is waiting in a queue. Because the server may idle while packets wait, the non-work-conserving method has relatively low server utilization, but it can accurately guarantee performance requirements determined in advance, such as a delay bound or a delay jitter limit. The work-conserving method achieves higher server utilization, but it must guarantee a minimum performance for each session irrespective of the operations of other sessions, and it needs a mechanism that fairly distributes idle server resources among the sessions requesting them, taking into account the contract speed of each session and the like.
The theory underlying the work-conserving method is the generalized processor sharing (GPS) algorithm disclosed in an article by Abhay K. Parekh and Robert G. Gallager, “A generalized processor sharing approach to flow control in integrated services networks: The single node case,” IEEE/ACM Transactions on Networking, vol. 1, pp. 344-357, June 1993. The article presents a hypothetical model in which all input traffic is treated as a flow of fluid and the server provides service simultaneously to all links requesting it, so that ideal fairness and optimal delay bound values are obtained. However, since the minimum unit of traffic in an actual packet network environment is a packet and a server can transmit only one packet at a time, the system suggested in the article cannot actually be implemented; the article only provides the conceptual performance criteria at which all packet scheduling algorithms aim.
Among the algorithms implemented in actual network environments, the one closest to the GPS concept described above is weighted fair queueing (WFQ), which assumes that the basic unit of ingress traffic is a packet. Because the WFQ algorithm must imitate a GPS server in which traffic modeled as a flow of fluid is all serviced at the same time, it introduces the concept of a virtual time to represent the operation state of the GPS server. The system virtual time is updated with the amount of work performed in the GPS server at each event occurring in the GPS system, such as the arrival of a new packet or the departure of a packet being serviced. The updated virtual finish time is used as the time stamp of a packet. This algorithm can provide performance close to that of the GPS algorithm, but in the worst case new packets from all links may arrive while one packet is being transmitted, so that calculation and update of the virtual time must be repeated as many times as there are links. Accordingly, the algorithm is difficult to implement in a high-speed network environment that requires the transmission order of packets to be determined at high speed.
The self-clocked fair queueing (SCFQ) algorithm was suggested to reduce this implementation complexity of the WFQ algorithm. Unlike the WFQ algorithm, the SCFQ algorithm has no continuous GPS simulation step; when a new packet arrives, it regards the time stamp of the packet currently being serviced as the virtual time. As a result, packets that arrive during one transmission share an identical virtual time, and the complexity of calculating the virtual time is greatly reduced. However, in the worst case, when an arbitrary packet arrives, packets of all other sessions can carry the same time stamp as the packet being serviced, so that the newly arrived packet has to wait until packets of all sessions are transmitted, even if its session complies with its traffic rules. In this case, the waiting time is proportional to the number of linked sessions. Thus, the SCFQ algorithm reduces the complexity of calculating the virtual time but degrades the delay bound that can be guaranteed, compared to the WFQ algorithm.
The starting potential fair queueing (SPFQ) algorithm was proposed to overcome this drawback of the SCFQ algorithm. In the SPFQ algorithm, whenever transmission of a packet finishes, the virtual time is reset to the smallest value among the virtual start times of the packets at the head of each queue. The SPFQ algorithm maintains good performance, but it requires an additional process for keeping the virtual start times in sorted order.
The minimum-delay self-clocked fair queueing (MD-SCFQ) algorithm reduces the calculation complexity of the WFQ algorithm with an approach similar to that of the SCFQ algorithm. Like the SCFQ algorithm, the MD-SCFQ algorithm has no separate step for simulating the GPS; it calculates the system virtual time by using information on the front packet waiting in a queue, so that it reduces the complexity of calculation compared to the WFQ algorithm and improves the delay characteristic compared to the SCFQ algorithm. However, this algorithm incurs the overhead of continuously collecting and managing information on the packets waiting in a queue whenever the system virtual time is calculated.
Thus, the work-conserving scheduling algorithms differ in how they calculate and maintain the virtual time function and the time stamps of their respective systems. In a scheduling system for a high-speed network environment, the most important performance parameter is the simplicity of the system, which governs how quickly the transmission order can be determined. With this in mind, the closer a scheduling system comes to the GPS in fairness and delay characteristics, which are the basic requirements of a work-conserving scheduling system, while maintaining this simplicity, the better the scheduling system is. Therefore, a new packet scheduling system and method are needed in which the delay bound and fairness index of the WFQ level can be guaranteed in a switch or a router in a high-speed packet exchange network and the calculation complexity of the system virtual time can be maintained at O(1).
The present invention provides a packet scheduling system and method by which the delay bound and fairness index of the WFQ level can be guaranteed in a switch or a router in a high-speed packet exchange network and the calculation complexity O(1) of the system virtual time can be maintained.
The present invention also provides a packet scheduling system and method by which when the transmission order of packets is determined a fair and optimal delay bound can be provided.
The present invention also provides a computer readable recording medium having embodied thereon a computer program for the packet scheduling method.
According to an aspect of the present invention, there is provided a packet scheduling system comprising: a traffic classifier which classifies traffic input from a plurality of input links, for each session; a central management unit which manages the agreed speed for each session and the virtual time of a system; a virtual finish time calculation unit which, in response to the agreed speed and the system virtual time, calculates the virtual finish time of each packet for the traffic and attaches the calculated virtual finish time to the header of the packet as a time stamp; a packet queue which stores the packet sent by the virtual finish time calculation unit, for each session; and a packet transmission unit which selects and outputs the packet having the shortest virtual finish time among packets stored in the packet queue.
According to another aspect of the present invention, there is provided a packet scheduling method comprising: (a) classifying traffic input from a plurality of input links for each session; (b) in response to an agreed speed for each session and the virtual time of a system provided by a central management unit, calculating the virtual finish time of each packet for the traffic and attaching the calculated virtual finish time to the header of the packet as a time stamp; (c) storing the packet, to which the virtual finish time is attached, in a packet queue for each session; and (d) selecting a packet having a shortest virtual finish time among packets stored in the packet queue and outputting the packet.
The above objects and advantages of the present invention will become more apparent by describing in detail preferred embodiments thereof with reference to the attached drawings in which:
Each input interface (10a-10m) installed in the input interface unit 10 performs the packet scheduling according to the present invention on traffic input from a plurality of input links and sends the scheduling result to the switch fabric 20 through the input port 17 of the switch fabric 20.
The switch fabric 20 switches the input traffic provided from the input interface unit 10 and then sends the input traffic to the output interface unit 30 through the output port 27. Each output interface (30a-30n) installed in the output interface unit 30 receives the traffic provided through the switch fabric 20, performs packet scheduling according to the present invention, and outputs the result to the output link.
The traffic classifier 11 performs multiple field-based traffic classification for the traffic input from the plurality of input links 101. The virtual finish time calculation unit 14 of the traffic scheduler 12 calculates the virtual finish time of each packet of the classified traffic and attaches the calculated virtual finish time to the header of each packet as a time stamp. The packet queue 16 stores the packets, each of which has a virtual finish time attached to its header, for each session. The packet transmission unit 17 performs actual transmission of packets. The central management unit 15 is connected to the virtual finish time calculation unit 14, the packet queue 16, and the packet transmission unit 17 and manages the agreed speed for each session, the virtual time of the system, and the like.
Each session of the traffic input from the input link 101 is classified by the traffic classifier 11. If the fields of the packet used for traffic classification belong to a layer-3 protocol such as the Internet Protocol (IP), sessions are classified based on multiple fields including the IP source address, the destination address, the protocol field, and the port number. For ATM, sessions are classified based on the virtual path identifier (VPI) and virtual channel identifier (VCI) values, and for multiprotocol label switching (MPLS), sessions are classified based on the label value of an MPLS header.
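The classification rule above can be sketched as a function that maps a packet to a session key. This is only an illustrative sketch: the type names (`IpPacket`, `AtmCell`, `MplsPacket`), the field names, the function `session_key`, and the use of both a source and a destination port are assumptions made for the example and are not taken from the patent.

```python
from typing import NamedTuple, Tuple, Union


class IpPacket(NamedTuple):
    # Hypothetical field names; the text lists source address, destination
    # address, protocol field, and port number as classification fields.
    src_addr: str
    dst_addr: str
    protocol: int
    src_port: int
    dst_port: int


class AtmCell(NamedTuple):
    vpi: int  # virtual path identifier
    vci: int  # virtual channel identifier


class MplsPacket(NamedTuple):
    label: int  # label value of the MPLS header


Packet = Union[IpPacket, AtmCell, MplsPacket]


def session_key(pkt: Packet) -> Tuple:
    """Return a hashable key that identifies the session of a packet."""
    if isinstance(pkt, IpPacket):
        return ("ip", pkt.src_addr, pkt.dst_addr, pkt.protocol,
                pkt.src_port, pkt.dst_port)
    if isinstance(pkt, AtmCell):
        return ("atm", pkt.vpi, pkt.vci)
    if isinstance(pkt, MplsPacket):
        return ("mpls", pkt.label)
    raise TypeError(f"unsupported packet type: {type(pkt).__name__}")
```

Packets that yield the same key would be placed in the same per-session queue.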
The traffic classified into each session is input to the virtual finish time calculation unit 14. The virtual finish time calculation unit 14 calculates a virtual finish time for each packet based on the system virtual time managed by the central management unit 15 and the virtual finish time of the previous packet of the corresponding session, and stores the packet with the calculated result in the packet queue 16 for the corresponding session. The packet transmission unit 17 selects the packet having the shortest virtual finish time among the packets stored in the packet queue 16 for each session and sends it to the output link.
The detailed operations performed in the virtual finish time calculation unit 14 and the packet transmission unit 17 after the traffic input from each input link is classified for each session by the traffic classifier 11 will now be explained.
The system virtual start time calculator 142 fetches the agreed speed for a session and the system virtual time from the central management unit 15. It then determines the system virtual start time as the larger of two values: the virtual finish time of the packet that arrived immediately before the current packet in the same session, and the system virtual time at the current time. The system virtual finish time calculator 144 calculates the system virtual finish time based on the virtual start time calculated by the system virtual start time calculator 142, the speed of the session to which the packet belongs, and the length of the packet.
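The two calculators can be sketched in a few lines. The start time rule follows the text directly. The finish time rule is an assumption: the text names only its inputs (start time, session speed, packet length), and the sketch uses the usual fair-queueing form in which the packet's service time at the agreed speed is added to the start time. The names `SessionState` and `stamp_packet` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SessionState:
    rate: float               # agreed speed of the session
    last_finish: float = 0.0  # virtual finish time of the previous packet


def stamp_packet(session: SessionState, length: float,
                 system_virtual_time: float) -> Tuple[float, float]:
    """Return (virtual start time, virtual finish time) for a new packet."""
    # Start time: the larger of the previous packet's finish time in this
    # session and the current system virtual time (as described in the text).
    start = max(session.last_finish, system_virtual_time)
    # Finish time: ASSUMED form, start time plus length divided by the rate.
    finish = start + length / session.rate
    # Remember the finish time for the next packet of the same session.
    session.last_finish = finish
    return start, finish
```

For a session with rate 100, a packet of length 500 arriving at system virtual time 2.0 would receive start time 2.0 and finish time 7.0; a second packet of length 100 arriving at system virtual time 3.0 would start at 7.0, because the previous packet's finish time is larger.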
Then, the system virtual start time calculation unit 142 of the virtual finish time calculation unit 14 compares the system virtual time value ($\upsilon(\tau)_{temp}$) calculated by the equation 1 with the virtual finish time ($F_i^{k-1}$) of the previous packet, and sets the virtual start time ($S_i^k$) of the packet to the larger of the two values in step 1453. This can be expressed as the following equation 2:
$$S_i^k = \max\left(F_i^{k-1},\ \upsilon(\tau)_{temp}\right) \qquad (2)$$
If the virtual start time ($S_i^k$) is calculated according to the equation 2, the system virtual finish time calculation unit 144 of the virtual finish time calculation unit 14 calculates the virtual finish time ($F_i^k$) as the following equation 3, by using the virtual start time ($S_i^k$) calculated in the step 1453, the length of the packet ($l_i^k$), and the session speed ($r_i$) of the session corresponding to the packet in step 1454:
If the virtual finish time ($F_i^k$) is calculated according to the equation 3, the virtual finish time calculation unit 14 attaches the calculated virtual finish time ($F_i^k$) to the header of each packet as a time stamp and transmits the packet to the packet queue 16 of each session (Refer to the step 1470 of
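Assuming the usual fair-queueing form, in which the time needed to serve the packet at the session speed is added to its virtual start time, the relation among the quantities named above can be written as follows. This form is an assumption based on those quantities, not a quotation:

$$F_i^k = S_i^k + \frac{l_i^k}{r_i} \qquad (3)$$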
As described above, when the transmission of the packet currently being transmitted is completed, the packet transmission unit 17 selects and outputs the packet to be transmitted next and readjusts the system virtual time. The detailed structure and operations for this will now be explained.
The packet list manager 172 manages a packet list stored in the packet queue 16, based on the virtual finish time of a packet calculated by the virtual finish time calculation unit 14. The packet transmitter 174 selects a packet having the shortest virtual finish time in the packet list managed by the packet list manager 172, and outputs the packet to the output link, and then sends a system virtual time update interrupt so that the system virtual time can be readjusted.
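A minimal sketch of the packet list manager and packet transmitter, assuming a binary heap keyed on the virtual finish time. The class `PacketList`, the function `transmit_next`, and the callbacks `send` and `on_virtual_time_update` are hypothetical names; the callback stands in for the system virtual time update interrupt, and the update rule itself is not modeled. A single heap over all packets is a simplification of the per-session queues.

```python
import heapq
import itertools
from typing import Any, Callable, List, Optional, Tuple


class PacketList:
    """Keeps stamped packets ordered by virtual finish time."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, Any]] = []
        # Sequence number breaks ties so that packets are never compared.
        self._seq = itertools.count()

    def push(self, finish_time: float, packet: Any) -> None:
        heapq.heappush(self._heap, (finish_time, next(self._seq), packet))

    def pop_next(self) -> Tuple[float, Any]:
        finish_time, _, packet = heapq.heappop(self._heap)
        return finish_time, packet

    def __len__(self) -> int:
        return len(self._heap)


def transmit_next(packets: PacketList,
                  send: Callable[[Any], None],
                  on_virtual_time_update: Callable[[float], None]
                  ) -> Optional[Any]:
    """Send the packet with the smallest time stamp, then signal an update."""
    if len(packets) == 0:
        return None
    time_stamp, packet = packets.pop_next()
    send(packet)                        # output the packet to the link
    on_virtual_time_update(time_stamp)  # stand-in for the update interrupt
    return packet
```

With a heap, each insertion and removal costs O(log n) in the number of queued packets; this cost concerns ordering the packet list and is separate from the cost of calculating the system virtual time.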
actually taken to transmit the packet, to the system virtual time ($\upsilon(\tau_{j-1})$) at a time ($\tau_{j-1}$) when transmission of the previous packet is completed:
Then, the packet transmission unit 17 selects a packet having the shortest virtual finish time among packets stored in the packet queue 16 and transmits the packet in step 1703. When it is assumed that the virtual finish time of this packet is $TS_{cur}$, the operation for readjusting the system virtual time is performed as the following equation 5 in step 1704:
Here, $L_i$ denotes the maximum size of a packet in session $i$, and $B(\tau_j)$ denotes a set of sessions having packets to be transmitted at time $\tau_j$.
As described above, according to the packet scheduling method of the present invention, the time stamps of packets that newly arrive while a packet is being transmitted are calculated based on an identical criterion. As a result, the complexity of calculating the system virtual time is reduced to O(1). In other words, the complexity is improved by a factor of N compared with the WFQ packet scheduling system, in which the update of the system virtual time (that is, the virtual time) depends on the number (N) of sessions and has a complexity of O(N).
The existing SCFQ packet scheduling system also has a complexity of O(1), because all packets that arrive while one packet is processed calculate their time stamps based on the time stamp (that is, the system virtual time) of the packet currently being serviced. However, although the complexity is improved in the SCFQ packet scheduling system, the maximum delay between the arrival of an arbitrary packet and its transmission depends on the number of sessions. Accordingly, it is difficult to apply that scheduling system to a node having a large volume of ingress traffic. In addition, the SPFQ packet scheduling system has the drawback that a sorted order must be maintained by virtual start time, so that the complexity of calculation becomes O(log N).
Meanwhile, the MD-SCFQ packet scheduling system reduces the complexity of calculation of the virtual time and improves the performance of the delay and fairness index, but still has a drawback that additional complexity for calculation of the virtual time exists.
Accordingly, the packet scheduling system and method according to the present invention solve the problems of the conventional packet scheduling systems described above: while a complexity of O(1) is maintained, a maximum delay limit at the level of the WFQ packet scheduling system can be guaranteed and a fairness index value corresponding to that of the WFQ packet scheduling system is provided. Therefore, the packet scheduling system and method according to the present invention have the advantages that they can be applied to a high-speed packet exchange node, and that they can be applied to fixed-sized packets such as ATM packets as well as to the variable-sized packets of the Internet.
The present invention may be embodied in a code, which can be read by a computer, on a computer readable recording medium. The computer readable recording medium includes all kinds of recording apparatuses on which computer readable data are stored. The computer readable recording media include storage media such as magnetic storage media (e.g., ROMs, floppy disks, hard disks, etc.), optically readable media (e.g., CD-ROMs, DVDs, etc.) and carrier waves (e.g., transmissions over the Internet). Also, the computer readable recording media can be distributed over computer systems connected through a network and can store and execute a computer readable code in a distributed mode.
The packet scheduling system and method according to the present invention as described above can guarantee the delay bound and fairness index of the WFQ level in a switch or a router on a high-speed packet exchange network, and can maintain the O(1) complexity of calculating the system virtual time. Accordingly, when the transmission order of packets is determined, a fair and optimal delay bound can be provided.
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US5850399 *||Mar 27, 1998||Dec 15, 1998||Ascend Communications, Inc.||Hierarchical packet scheduling method and apparatus|
|US5859835||Apr 15, 1996||Jan 12, 1999||The Regents Of The University Of California||Traffic scheduling system and method for packet-switched networks|
|US5905730 *||Mar 27, 1998||May 18, 1999||Ascend Communications, Inc.||High speed packet scheduling method and apparatus|
|US5991812 *||Mar 6, 1997||Nov 23, 1999||Controlnet, Inc.||Methods and apparatus for fair queuing over a network|
|US6075791||Oct 28, 1997||Jun 13, 2000||Lucent Technologies Inc.||System for guaranteeing data transfer rates and delays in packet networks|
|US6081507 *||Nov 4, 1998||Jun 27, 2000||Polytechnic University||Methods and apparatus for handling time stamp aging|
|US6134217||Apr 16, 1996||Oct 17, 2000||The Regents Of The University Of California||Traffic scheduling system and method for packet-switched networks with fairness and low latency|
|US6396843||Oct 30, 1998||May 28, 2002||Agere Systems Guardian Corp.||Method and apparatus for guaranteeing data transfer rates and delays in data packet networks using logarithmic calendar queues|
|JP2001519121A||Title not available|
|JP2002118585A||Title not available|
|KR20010000087A||Title not available|
|1||A Self-Clocked Fair Queueing Scheme for Broadband Applications, pp. 636-646, 1994.|
|2||Efficient Fair-Queueing Algorithms for Packet-Switched Networks, 12 pages, 1998.|
|3||Efficient Fair-Queueing Algorithms, 36 pages, 1998.|
|4||IEEE/ACM Transactions on Networking, vol. 1, No. 3, Jun. 1993, pp. 344-357.|
|5||Minimum-Delay Self-Clocked Fair Queueing Algorithm for Packet-Switched Networks, pp. 1112-1121, 1998.|
|6||Rate-Proportional Servers: A Design Methodology For Fair Queueing Algorithms, 27 pages, 1998.|
|7||Service Disciplines for Guaranteed Performance Service in Packet-Switching Networks, pp. 1-23, 1995.|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US7961630 *||Jun 14, 2011||Agilent Technologies, Inc.||Methods and apparatus for stimulating packet-based systems|
|US8375433 *||Feb 12, 2013||Tsinghua University||Method for multi-core processor based packet classification on multiple fields|
|US20090086749 *||Sep 27, 2007||Apr 2, 2009||Bruce Alan Erickson||Methods and apparatus for stimulating packet-based systems|
|US20100192215 *||Jan 19, 2010||Jul 29, 2010||Tsinghua University||Method for Multi-Core Processor Based Packet Classification on Multiple Fields|
|DE102014112901A1 *||Sep 8, 2014||Mar 10, 2016||Phoenix Contact Gmbh & Co. Kg||Kommunikationseinrichtung, Kommunikationssystem und Verfahren zum synchronisierten Senden von Telegrammen|
|U.S. Classification||370/537, 370/230, 370/412, 370/419|
|International Classification||H04L12/875, H04L12/815, H04L12/863, H04J3/02|
|Nov 6, 2003||AS||Assignment|
Owner name: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTIT
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KO, NAM SEOK;KWAK, DONG YONG;REEL/FRAME:014689/0352
Effective date: 20030811
|Dec 28, 2011||FPAY||Fee payment|
Year of fee payment: 4
|Dec 24, 2015||FPAY||Fee payment|
Year of fee payment: 8