US 7231458 B2
A system and method for determining a chronometrically optimal web site location for access by a client based on proximity measurements on established connections that are a result of requests for actual content. In one embodiment, the technique involves a race condition between local domains each transmitting TCP packets loaded with a HyperText Markup Language (HTML) Base tag identifying that local domain. The earliest received TCP packet is incorporated into the TCP stream. In another embodiment, the technique involves a race condition between local domains each transmitting files having links to streaming media translated to point to its own local domain. The earliest received file is incorporated into the TCP stream while the others are discarded as TCP resends. In yet another embodiment, HTTP redirect operations are performed once for each grouping of links, and only the earliest redirect packet to reach the client is incorporated into the existing TCP stream.
1. Adapted for a network including a client and a plurality of local domains including at least a first local domain and a second local domain, a method comprising:
segmenting content including a Base Uniform Resource Identifier (URI) into multiple packets by a personal content director of the first local domain, a first packet of the multiple packets including the Base URI;
transmitting the first packet of the multiple packets by the personal content director of the first local domain to at least a personal content director for the second local domain;
substituting the Base URI for a HyperText Markup Language (HTML) Base tag within each of the first packets by the personal content directors of at least the first and second local domains, the HTML Base tag substituted by the personal content director of the first local domain now points to the first local domain and the HTML Base tag substituted by the personal content director of the second local domain now points to the second local domain; and
transmitting the first packets with the HTML Base tags by the personal content directors to the client;
incorporating an earliest first packet received by the client from the plurality of personal content directors into a data stream and disregarding the later received first packets; and
accessing the local domain associated with the personal content director that transmitted the first packet earliest received by the client for subsequent data requests.
2. The method of
retrieving content by the personal content director at the first local domain.
3. The method of
4. The method of
5. The method of
6. The method of
7. The method of
8. Adapted for a network including a client and a plurality of local domains including at least a first local domain and a second local domain, a method comprising:
substituting a Base Uniform Resource Identifier (URI) included in content for a HyperText Markup Language (HTML) Base tag within packets of the content by personal content directors of at least the first local domain and the second local domain, the HTML Base tag substituted by the personal content director of the first local domain now points to the first local domain and the HTML Base tag substituted by the personal content director of the second local domain now points to the second local domain; and
transmitting packets with the HTML Base tags by the personal content directors of the first local domain and the second local domain to the client;
incorporating an earliest received packet with the HTML Base tag by the client into a data stream and disregarding later received packets with the HTML Base tag; and
accessing the local domain associated with the personal content director that transmitted the earliest received packet by the client for subsequent data requests.
9. The method of
retrieving content by the personal content director at the first local domain;
segmenting content including the Base URI into multiple packets by the personal content director of the first local domain, a first packet of the multiple packets including the Base URI; and
transmitting the first packet of the multiple packets by the personal content director of the first local domain to at least a personal content director for the second local domain.
10. The method of
11. The method of
12. The method of
13. The method of
14. The method of
The invention generally relates to the field of communications. More particularly, the invention relates to an embodiment for managing traffic over a network.
As usage of the Internet has increased, various web sites have responded by adding such features as redundancy and load balancing. Accordingly, it has become important to direct a client to the geographically closest and least busy web site. The practice of dispersing data from a web site closer to the client is referred to as “client proximity” and has been widely accepted.
In an attempt to direct clients to the geographically closest and least busy web site, a number of techniques have been implemented. Current technologies that attempt to provide a measure of geographic distribution of Web requests rely primarily on Domain Name System (DNS) techniques; namely, techniques normally performed by a Domain Name System (DNS) server alone or in combination with other logic.
One prior technique is Round-Robin Domain Name System (DNS). This technique involves entering multiple IP addresses to represent a single DNS hostname. As clients resolve the hostname, DNS responds by cycling through the multiple listed IP addresses.
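The cycling behavior described above can be illustrated with a minimal sketch. The class name, hostname and addresses below are hypothetical and are not taken from any DNS implementation; the sketch only shows that successive lookups rotate through the listed addresses without regard to where the client is located.

```python
from itertools import cycle


class RoundRobinResolver:
    """Toy resolver that cycles through the addresses listed for a hostname."""

    def __init__(self, records):
        # records: hostname -> list of IP address strings (hypothetical data)
        self._cycles = {host: cycle(addrs) for host, addrs in records.items()}

    def resolve(self, hostname):
        # Each lookup returns the next listed address, wrapping around at the end.
        return next(self._cycles[hostname])


resolver = RoundRobinResolver(
    {"www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]}
)
answers = [resolver.resolve("www.example.com") for _ in range(4)]
```

The fourth lookup returns the first address again, since the list has been exhausted once.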
Another technique involves computation of routing metrics by the DNS server, with these metrics used to determine how far away a client is from the various web sites forming the global domain. This allows the DNS server to answer a DNS request with a web site address associated with a local domain considered to be closest to the client. The primary DNS server can determine the distance from the client to each web site by counting network hops.
Another technique involves the use of a DNS server in conjunction with routers to approximate network distance to a web site from the requester. This technique is achieved through the announcement of a single IP address or a single set of IP addresses throughout the Internet resolving to a single hostname.
Yet another technique involves geographically distributed DNS servers that provide differing IP addresses on a per server basis. This technique is achieved through the announcement of a single IP address for authoritative DNS servers, which when queried may each provide a different response specifying the nearest web site.
While these DNS techniques provide some load sharing capabilities, they are inherently problematic because they are difficult and resource-intensive to resolve. In addition, the DNS solutions are incapable of being content aware and are, at best, useful in assisting a more robust approach by initially guiding a client to a web resource.
There are now attempts to develop a client proximity selection process using personal content directors (i.e. site selectors) in which clients are directed to “chronometrically optimal” locations, namely locations that provide the best overall response time, taking into account all factors including network topology (latency), server response time and the like. Exemplary client proximity processes are described in a commonly-owned, U.S. patent application Ser. No. 09/728,305 (filed Nov. 30, 2000) and a concurrently filed U.S. Patent Application entitled “A Method and Apparatus For Discover Client Proximity Using ILX Translations” (App. No. 10/027,686).
During “Refresh”, “Image Insert” and even “ILX” modes of operation, the client proximity process uses HTTP redirects to enable a client to point at another site after client proximity computations have completed. The use of HTTP redirects, being additional overhead, poses many disadvantages. An illustrative example of an HTTP redirect is set forth below in Table 1.
For instance, HTTP redirects cause additional traffic over the network and additional latency to complete these Transmission Control Protocol (TCP) connections. In addition, HTTP redirects are also visible to the client viewing his or her browser, which may be undesired by the web site owners. Also, bookmarks to a selected site after the client proximity computations may preclude such client proximity computations from being performed for that client in the future.
For streaming media environments, the overhead associated with redirects is small in comparison with the amount of information downloaded. While redirects are supported by the RTSP protocol used by QUICKTIME® players by Apple Computer of Cupertino, Calif. and REALPLAYER® by RealNetworks of Seattle, Wash., they are not supported by the MICROSOFT® Media Services (mms) protocol. A technique that can be utilized by many streaming video protocols is desired.
In general, the invention relates to a system and method for determining a chronometrically optimal web site location for access by a client based on proximity measurements on established connections that are a result of requests for actual content.
For one embodiment, at each web site, at least one TCP packet of downloaded content is loaded with a HyperText Markup Language (HTML) Base tag identifying that local domain and is subsequently sent to the client under a race condition. The earliest received TCP packet is incorporated into the TCP stream while the other competing TCP packets are considered resends and disregarded. The remaining packets forming the entire downloaded content are sent from the synchronizing personal content director (PCD). Hence, during that session, future requests are directed to the most proximate web site (web site associated with the earliest TCP packet).
For another embodiment, a single packet sized file (e.g., file configured in accordance with any metafile format) is retrieved by the synchronizing PCD and undergoes translations for various links to point to its local domain. All links may be subject to the same translations or one proximity computation is performed and remaining links remain unchanged. The file is also sent to other participating PCDs, which perform similar translations. The duplicated files are sent to the client by each PCD under a race condition, normally simultaneously, and the earliest received ASX file is incorporated into the TCP stream while the others are discarded as TCP resends. Hence, during that session, future requests are directed to the most proximate web site.
For yet another embodiment, the synchronizing PCD performs a grouping of link types upon retrieving a requested file. An HTTP redirect footrace is performed once for each group of links (the footrace process is performed as described in patent application Ser. No. 09/728,305). Each redirect string sent by remote PCDs directs the client to once again connect back to the synchronizing PCD. Only the first redirect packet to reach the client is incorporated into the existing TCP stream. Similar to the double-redirect method, the redirect string contains a tag that indicates which remote PCD won the footrace. The redirect races are performed sequentially, one for each group of links, until the proximity for all groups has been learned. At this point, the synchronizing PCD rewrites all links in all groups to point to the most proximate local domain (there is no need to send the metafile to the remote PCDs).
The features and advantages of the invention will become apparent from the following detailed description of the invention in which:
In general, one embodiment of the invention relates to a system and method for determining a chronometrically optimal (most proximate) web site location for access by a client based on proximity measurements on established connections that are a result of requests for actual content. For a first embodiment, a Base Uniform Resource Identifier (URI) is substituted with an HTML Base tag by the PCDs. For a second embodiment, link translations are conducted by the PCDs on links found in retrieved files such as playlists for example. Yet, for a third embodiment, HTTP redirect footraces for groups of link types are iteratively conducted by the PCDs, and subsequently link translation (instead of HTTP redirect) is used as an execution method.
Certain details are set forth below in order to provide a thorough understanding of the invention, albeit the invention may be practiced through many embodiments other than those illustrated. Well-known logic and operations are not set forth in detail in order to avoid unnecessarily obscuring the invention.
In the following description, certain terminology is used to describe certain features of the invention. For example, a “personal content director” or “PCD” is a computing device that is adapted to a network (e.g., the Internet) in order to optimize performance of domains hosted on geographically distributed, mirrored web sites. Normally, a PCD comprises internal logic, namely hardware, firmware, software module(s) or any combination thereof, having an architecture generally equivalent to a computer (e.g., server, desktop, laptop, hand-held, mainframe, or workstation), set-top box, or a network switching device such as a router, bridge or brouter.
A “client” is a computing device that executes Web Browser software to communicate over a network in order to download information from a web site. Such information may include HyperText Markup Language (HTML) web pages, Microsoft ASX files or RealNetworks SMIL (Synchronized Multimedia Integration Language) or similar metafile formats from Apple QuickTime or others. In two embodiments, the HTML web page contains a metafile format that initiates execution of a streaming media player such as WINDOWS MEDIA™ player or REALPLAYER® as the final target client application.
A “software module” is a series of instructions that, when executed, performs a certain function. Examples of a software module include an operating system, an application, an applet, a program or even a routine. One or more software modules may be stored in a machine-readable medium, which includes but is not limited to an electronic circuit, a semiconductor memory device, a read only memory (ROM), a flash memory, a type of erasable programmable ROM (EPROM or EEPROM), a floppy diskette, a compact disk, an optical disk, a hard disk, or the like.
As shown, web site 120 1 includes a personal content director (PCD) 140 1 and a server farm 150 1. A “server farm” is generally defined as a server or multiple servers operating in a collective manner. Similarly, web sites 120 2 and 120 3 include PCDs 140 2 and 140 3 and server farms 150 2 and 150 3, respectively. The PCD 140 1, 140 2 or 140 3 that receives the initial client request for that session is referred to as the “synchronizing PCD”, which orchestrates the client proximity selection process.
During the client proximity selection process, each PCD 140 1–140 3 is adapted to operate as a proxy server to communicate requests by client 110 to one or more servers 150 1–150 3. During normal operations, however, the servers 150 1–150 3 are in communication with the client 110 over network 130 as shown by routing paths 151 1–151 3. Moreover, each PCD 140 1–140 3 is further configured to communicate with each other and client 110 using appropriate Transmission Control Protocol (TCP) communication over network 130 as shown by paths 152 1–152 3.
For this embodiment, network 130 is any type of wide area network (WAN) such as the Internet. Of course, network 130 may be configured as a type of local area network (LAN) while still maintaining the spirit and scope of the invention.
Although not shown, system 100 may include an authoritative Domain Name Server (DNS) that resolves (translates) an alphanumeric, web site domain name into addresses recognized by the PCDs 140 1–140 3. For the system 100, multiple distributed web sites 120 1–120 3 appear to be a single domain (referred to as a “global domain”). A global domain may be separated into multiple “real” web sites, each referred to herein as “local domains” and uniquely registered in DNS with a unique uniform resource locator (URL) for example. To participate in the client proximity selection process, a PCD needs to be configured as a member of that local domain. Each PCD can have more than one global domain; a mirror web site can support more than one local domain; and a PCD can be a member of more than one local domain.
For example, in one embodiment, the web site domain “www.nortelnetworks.com” may be resolved into one of a plurality of IP addresses that is associated with a participating PCD 140 1–140 3 of web sites 120 1–120 3, respectively. These IP addresses are entered into DNS as the addresses for the web site domains shown in Table 1. Such selection as to which IP address may be
Referring now to
Processor 220 controls the retrieval of content from server(s) at the web site 120, of
Memory 230 is loaded with (i) software modules executable by processor 220 to support client proximity selection process operations of web site 120 1 and (ii) perhaps a client network cache (CNC) 240. When employed, CNC 240 can be configured to store representations of client network addresses associated with client/local domain responses. For one embodiment, for the “Page Race” and “Meta File Race” modes of operation to use the CNC, an additional HTTP or RTSP link may be inserted within the transmitted page or metafile. This link would point to the synchronizing PCD, and would merely cause the client to initiate a connection indicating which local domain won the race. “Link-by-Link Meta File Translation” would not require such links because it already provides a redirect to the synchronizing PCD. Upon receiving an initial client request (e.g., HTTP GET request), PCD 140 1 acts as the synchronizing PCD. Upon finding an entry 250 in its CNC 240, synchronizing PCD 140 1 would direct client 110 of
A. Page Race Mode
Referring now to
Upon receiving the request, the synchronizing PCD proxies server(s) of its Web site for requested content forming the Web page (block 310). For instance, the requested content may form a home page (default.html). The home page may include an explicit Base Uniform Resource Identifier (Base URI) HTML tag (referred to as “Base URI”), which is used as the base of an HTML document for the purpose of resolving hyperlink destinations. For instance, the Base URI may be represented as <Base URI=“www.nortelnetworks.com”>. In the event that a Base URI is not explicitly specified, the client software will use the requested URI (the URI of the original request) as the Base URI.
As described in blocks 320 and 330, the synchronizing PCD receives the content from the server(s) and segments the content into multiple TCP packets. The first TCP packet includes the Base URI. Thereafter, only the first TCP packet including the Base URI is sent to all participating PCDs to set up a race condition. As an alternative embodiment, the synchronizing PCD may replace the Base URI with an HTML Base tag that points to its own local domain before the content is segmented. The HTML Base tag may be represented as <Base href=“wwwa.nortelnetworks.com”>. Hence, the TCP packet routed to the other PCDs would include the HTML Base tag for the synchronizing PCD, which would be replaced as described.
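The per-domain substitution can be sketched as a simple text replacement. This is a minimal illustration under stated assumptions, not the PCD logic itself: the function name, the regular expression and the sample payload are hypothetical, straight quotes are used in place of the typographic quotes shown in the description, and packet framing (headers, lengths, sequence numbers) is ignored.

```python
import re

# Matches the explicit Base URI tag in the form given in the description.
BASE_URI_TAG = re.compile(r'<Base\s+URI="[^"]*">', re.IGNORECASE)


def substitute_base_tag(first_packet: str, local_domain: str) -> str:
    """Replace the Base URI tag with an HTML Base tag pointing at local_domain."""
    base_tag = f'<Base href="{local_domain}">'
    # A function replacement avoids any escape processing of the domain string.
    return BASE_URI_TAG.sub(lambda _match: base_tag, first_packet, count=1)


# Hypothetical payload of the first TCP packet.
first_packet = '<html><head><Base URI="www.nortelnetworks.com"></head>'

# Each participating PCD would produce its own copy of the first packet.
per_pcd = {
    domain: substitute_base_tag(first_packet, domain)
    for domain in ("wwwa.nortelnetworks.com", "wwwb.nortelnetworks.com")
}
```

Each entry of `per_pcd` differs only in the domain named by the HTML Base tag, which is what allows the client to treat the competing copies as the same segment.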
Referring now to
For example, at the synchronizing PCD 140 1, the Base URI 435 is replaced with an HTML Base tag 500, namely <Base href=“wwwa.nortelnetworks.com”>. Similarly, at PCD 140 2, the Base URI is replaced with a different HTML Base tag 510, namely <Base href=“wwwb.nortelnetworks.com”>. At PCD 140 3 of
Referring back to
Although not shown, in accordance with another embodiment, the first TCP packets may be transmitted generally concurrent with each other. For this embodiment, the PCDs may impose transmission time delays to account for different loads at the sites (load-balancing).
The earliest first TCP packet with the HTML Base tag received by the client is incorporated into the TCP stream while the other first TCP packets are disregarded as TCP resends (block 360). The other packets forming the web page are reassembled with the earliest first TCP packet in accordance with their sequence numbers (block 370). In response to the next HTTP GET request, the web site with optimal client proximity (winner of the race condition identified by the accepted HTML Base tag) is now accessed during that session (block 380). It is contemplated that once the session is discontinued or disrupted, the client proximity selection process may need to be conducted again to ensure optimal network performance.
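The client-side effect of the race can be approximated with a small simulation. The sketch below is not a TCP implementation: sequence numbers are simplified to segment indices, and the arrival times, payloads and sender labels are hypothetical. It shows only the rule stated above, namely that the first copy of a segment to arrive is kept, later copies are disregarded as resends, and the stream is rebuilt in sequence order.

```python
def reassemble(arrivals):
    """arrivals: iterable of (arrival_time, seq, payload, sender) tuples.

    Keeps the earliest arrival for each sequence number, disregards later
    copies as resends, and joins the kept payloads in sequence order.
    """
    accepted = {}
    for _arrival_time, seq, payload, sender in sorted(arrivals, key=lambda a: a[0]):
        if seq not in accepted:  # the first copy of this segment wins
            accepted[seq] = (payload, sender)
        # any later copy with the same seq is treated as a resend and dropped
    stream = "".join(accepted[seq][0] for seq in sorted(accepted))
    winner = accepted[min(accepted)][1]  # sender of the accepted first packet
    return stream, winner


# Hypothetical arrivals: two competing first packets and one later segment.
arrivals = [
    (0.042, 0, '<Base href="wwwa.nortelnetworks.com">', "PCD 1"),
    (0.017, 0, '<Base href="wwwb.nortelnetworks.com">', "PCD 2"),
    (0.050, 1, "<body>...</body>", "PCD 1"),
]
stream, winner = reassemble(arrivals)
```

In this run the copy from "PCD 2" arrives first, so its HTML Base tag ends up in the reassembled stream and subsequent relative links would resolve against that local domain.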
B. “Meta File Race” Mode
Referring now to
Initially, during the client proximity selection process, a request (e.g., HTTP GET request for a metafile, such as a Microsoft ASX or RealMedia SMIL file; such files may be identified by filename extension, e.g. “.asx” or “.ram”, or by file content) is received by the synchronizing PCD (block 600). The request may be directed from the client or from the DNS server, which selects the synchronizing PCD and forwards the request thereto.
Upon receiving the request, the synchronizing PCD proxies server(s) of its Web site for requested content (block 610). In response to receiving the request, one of the servers returns a file (most likely a dynamically generated file) configured in accordance with any desired metafile format (block 620). The file may be a “playlist,” namely a list of linked titles for streaming media capable of being downloaded from the server(s). The selection of which media links are placed within the file may be based on demographics or prior site accesses by the user.
For example, the file may be configured in an ASX metafile format used with WINDOWS MEDIA™ services supported by Microsoft Corporation of Redmond, Wash. This particular configuration will be described in this embodiment for illustrative purposes only. Alternatively, of course, the file may be configured in RAM or SMIL metafile formats supported by a media player produced by RealNetworks of Seattle, Wash. or even another type of metafile format.
Thereafter, the server returns the file (e.g., ASX file) to the synchronizing PCD (block 630). The synchronizing PCD parses the links of the ASX file under predetermined link translation rules to determine whether any translations are warranted and conducts link translations (block 640). These “link translation rules” indicate which links may or may not undergo link translations during the client proximity selection process.
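A minimal sketch of rule-driven link translation follows. The description does not specify the form of the link translation rules, so the sketch assumes, purely for illustration, that a rule is a set of URL schemes eligible for translation; the function name, host names and sample ASX content are likewise hypothetical.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches href attributes such as those found on ASX <Ref> elements.
HREF = re.compile(r'href="([^"]+)"', re.IGNORECASE)


def translate_links(metafile, global_host, local_host, translatable_schemes):
    """Rewrite links to global_host so that they point at local_host.

    translatable_schemes stands in for the link translation rules: only
    links whose scheme is listed are rewritten; all others are left as-is.
    """
    def rewrite(match):
        parts = urlsplit(match.group(1))
        if parts.scheme in translatable_schemes and parts.netloc == global_host:
            parts = parts._replace(netloc=local_host)
            return f'href="{urlunsplit(parts)}"'
        return match.group(0)  # rule does not apply; keep the original link

    return HREF.sub(rewrite, metafile)


# Hypothetical single-packet ASX playlist.
asx = (
    '<ASX version="3.0">'
    '<Entry><Ref href="mms://www.example.com/clip1.asf"/></Entry>'
    '<Entry><Ref href="http://www.example.com/info.html"/></Entry>'
    '</ASX>'
)
translated = translate_links(asx, "www.example.com", "wwwa.example.com", {"mms"})
```

With the rule set to `{"mms"}`, only the streaming link is pointed at the local domain, while the HTTP link is left unchanged.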
For example, as shown in
Referring back to
In one embodiment, all PCDs are synchronized to transmit the translated packet (ASX file), where synchronization is accomplished through time synchronization messages exchanged by the PCDs over separate TCP connections via the command interface. This enables generally simultaneous transmissions or generally concurrent transmissions where transmission delays may occur to account for different loads at the sites (load-balancing).
The earliest translated packet received by the client is placed within the TCP stream while subsequent packets are disregarded as TCP resends (block 680). Thereafter, the client retrieves media from the most proximate site during that session (block 690).
C. “Link-by-Link Meta File Race” Mode
Referring now to
Upon receiving the request, the synchronizing PCD proxies server(s) of its Web site for requested content (block 810). In response to receiving the request, one of the servers dynamically generates a file (e.g., a playlist) configured in accordance with any desired metafile format (block 820).
For example, the file may be configured in an ASX metafile format used with WINDOWS MEDIA™ services supported by Microsoft Corporation of Redmond, Wash. This particular configuration will be described in this embodiment for illustrative purposes only. Alternatively, of course, the file may be configured in RAM, SMIL or another type of metafile format.
Thereafter, the server returns the file (e.g., ASX file) to the synchronizing PCD (block 830). The synchronizing PCD groups the links within the file according to a particular granularity provided by the parsing rules followed by the internal logic of the PCD (block 840). Thereafter, the synchronizing PCD creates a redirect packet for a first grouping and provides such packets to the participating PCDs to perform a foot race operation via HTTP or RTSP (blocks 850 and 860). The client sends a redirect back in response to the earliest received redirect packet from the synchronizing PCD and participating PCDs (block 870). This redirect indicates the most proximate site (identification of the web site from which the earliest received redirect packet is associated). This process continues until all groupings have been handled, even a grouping with a single link (block 880). After the synchronizing PCD has information regarding the proximate (winning) local domains for each grouping, the synchronizing PCD now translates the relative links in the ASX file accordingly for transmission to the client (block 890).
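The group, race and rewrite sequence can be sketched as follows. Several assumptions are made for illustration only: links are grouped by URL scheme (the description leaves the granularity to the parsing rules), the footrace is replaced by a lookup in a table of hypothetical latencies rather than real HTTP or RTSP redirects, absolute links are used for simplicity, and all names and hosts are invented.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_links(links):
    """Group links by URL scheme (one possible grouping granularity)."""
    groups = defaultdict(list)
    for link in links:
        groups[urlsplit(link).scheme].append(link)
    return dict(groups)


def run_footraces(groups, race):
    """Run one race per group, in order; race(key) returns the winning local domain."""
    return {key: race(key) for key in groups}


def rewrite_links(links, winners):
    """Point each link at the local domain that won the race for its group."""
    rewritten = []
    for link in links:
        parts = urlsplit(link)
        rewritten.append(parts._replace(netloc=winners[parts.scheme]).geturl())
    return rewritten


# Hypothetical latencies standing in for the outcome of each redirect footrace.
latency_ms = {
    "mms": {"wwwa.example.com": 40, "wwwb.example.com": 25},
    "http": {"wwwa.example.com": 12, "wwwb.example.com": 30},
}


def simulated_race(group_key):
    timings = latency_ms[group_key]
    return min(timings, key=timings.get)  # lowest latency wins the race


links = [
    "mms://www.example.com/a.asf",
    "http://www.example.com/b.html",
    "mms://www.example.com/c.asf",
]
groups = group_links(links)
winners = run_footraces(groups, simulated_race)
rewritten = rewrite_links(links, winners)
```

Because each group is raced separately, different link types in the same file may end up pointing at different local domains, as in this run where the "mms" links and the "http" link have different winners.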
For example, as shown in