Publication number: US20010000083 A1
Publication type: Application
Application number: US 09/726,679
Publication date: Mar 29, 2001
Filing date: Nov 29, 2000
Priority date: Oct 28, 1997
Also published as: US6393526, US6442651, WO1999022316A1
Inventors: Doug Crow, Bert Bonkowski, Harold Czegledi, Tim Jenks
Original Assignee: Doug Crow, Bert Bonkowski, Harold Czegledi, Tim Jenks
Shared cache parsing and pre-fetch
US 20010000083 A1
Abstract
The invention provides a method and system for reducing latency in reviewing and presenting web documents to the user. A cache coupled to one or more web clients requests web documents from web servers on behalf of those web clients and communicates those web documents to the web clients for display. The cache parses the web documents as they are received from the web server, identifies references to any embedded objects, and determines if those embedded objects are already maintained in the cache. If those embedded objects are not in the cache, the cache automatically pre-fetches those embedded objects from the web server without need for a command from the web client. The cache maintains a two-level memory including primary memory and secondary mass storage. At the time the web document is received, the cache determines if any embedded objects are maintained in the cache but are not in primary memory. If those embedded objects are not in primary memory, the cache automatically pre-fetches those embedded objects from secondary mass storage to primary memory without need for a request from the web client. Web documents maintained in the cache are periodically refreshed, so as to assure those web documents are not stale. The invention is applied both to original requests to communicate web documents and their embedded objects from the web server to the web client, and to refresh requests to communicate web documents and their embedded objects from the web server to the cache.
Claims(9)
What is claimed is:
1. A method, including the steps of
receiving web documents at a shared cache from a web server or mass storage for communicating said web documents to a web client for display;
parsing said web documents for references to embedded objects;
determining if said embedded objects are already maintained in said shared cache; and
conditionally pre-fetching said embedded objects from said web server in response to said step of determining, without need for a command from said web client.
2. A method as in claim 1, including the steps of
maintaining at said shared cache a two-level memory including primary memory and secondary mass storage;
locating said embedded objects in said shared cache but not in said primary memory;
conditionally pre-loading said embedded objects from said secondary mass storage into said primary memory in response to said step of locating, without need for a request from said web client.
3. A method as in claim 1, wherein said web documents include refresh copies of said web documents requested by said shared cache from said web server.
4. A system, including
a shared cache coupled to at least one web server and coupled to a plurality of web clients, said shared cache being capable of receiving requests for web documents from said web clients, requesting said web documents from said web server or mass storage, receiving said web documents from said web server or mass storage, and communicating said web documents to said web clients;
said shared cache including
means for parsing said web documents for references to embedded objects;
means for determining if said embedded objects are already maintained in said shared cache; and
means for conditionally pre-fetching said embedded objects from said web server in response to said means for determining, without need for a command from said web client.
5. A system as in claim 4, including
a two-level memory at said shared cache, said two-level memory including primary memory and secondary mass storage;
means for locating said embedded objects in said shared cache but not in said primary memory; and
means for conditionally pre-loading said embedded objects from said secondary mass storage into said primary memory in response to said means for locating, without need for a request from said web client.
6. A system as in claim 4, wherein said web documents include refresh copies of said web documents requested by said shared cache from said web server.
7. A shared cache, including
means for parsing said web documents, said web documents being received from a web server or from mass storage, for references to embedded objects;
means for determining if said embedded objects are already maintained in said shared cache; and
means for conditionally pre-fetching said embedded objects from said web server in response to said means for determining, without need for a command from said web client.
8. A cache as in claim 7, including
a two-level memory at said shared cache, said two-level memory including primary memory and secondary mass storage;
means for locating said embedded objects in said shared cache but not in said primary memory; and
means for conditionally pre-loading said embedded objects from said secondary mass storage into said primary memory in response to said means for locating, without need for a request from said web client.
9. A cache as in claim 7, wherein said web documents include refresh copies of said web documents requested by said shared cache from said web server.
Description
BACKGROUND OF THE INVENTION

1. 1. Field of the Invention

2. This invention relates to caches.

3. 2. Related Art

4. When presenting and reviewing data using a web browser or web client, that is, a client program for the web (the “World Wide Web”) such as Netscape Corporation's “Navigator” product or Microsoft Corporation's “Internet Explorer” product, it is desirable to present the data with as little delay as possible. If the user of the web client has to wait too long for the data to be displayed, this can lead to user dissatisfaction.

5. Some web clients access the web using a proxy cache, that is, a device for requesting web documents on behalf of the web client and for caching those web documents for possible later use. The proxy cache acts to reduce the amount of communication bandwidth used between the web client and web servers. A proxy cache can be shared by more than one web client, in which case it acts to reduce the total amount of communication bandwidth used between all of its web clients and web servers. One advantage of the proxy cache is that web documents stored in cache can be accessed more quickly than re-requesting those web documents from their originating web server.

6. One problem in the art is that a document requested by the web client (a “web document”) can include, in addition to text and directions for display, embedded objects which are to be displayed with the web document. Embedded objects can include pictures, such as data in GIF or JPEG format, other multimedia data, such as animation, audio (such as streaming audio), movies, video (such as streaming video), program fragments, such as Java, Javascript, or ActiveX, or other web documents, such as when using frames. The web client must parse the web document to determine the embedded objects, and then request the embedded objects from the web server.

7. While using a proxy cache ameliorates this problem somewhat, the problem persists. If there are many embedded objects in the web document, it can take substantial time to identify, request, communicate, and display all of them. Parsing and requesting embedded objects by the web client is serial, and most web clients are set to request only a small number of embedded objects at a time. Web clients requesting embedded objects perform this task in parallel with rendering those objects for display, further slowing operation.

8. Moreover, known proxy caches use a two-level memory having both primary memory and secondary mass storage. Even those embedded objects already maintained in the cache, and thus accessible by the web client without requesting them from the web server, might have been dropped out of the primary memory to secondary mass storage, possibly delaying communication of the embedded objects from the proxy cache to the web client and thus delaying display of those embedded objects to the user.

9. Accordingly, it would be advantageous to provide a method and system for reducing latency in reviewing and presenting web documents to the user. This advantage is achieved in a system in which web documents are parsed by a cache for references to embedded objects, and those embedded objects are pre-fetched from the web server or pre-loaded from secondary mass storage by the cache before they are requested by the web client.

10. Teachings of the art include (1) the known principle of computer science that devices work better when they are indifferent to the nature of the data they process, and (2) the known principle of client-server systems that it is advantageous to assign processing-intensive tasks to clients, rather than to servers, whenever possible. The invention is counter to the first teaching, as the cache alters its behavior in response to its parsing of the web documents it receives for communication to the client. The invention is also counter to the second teaching, as the cache takes on the additional processing tasks of parsing the web document for embedded objects and, if necessary, independently requesting those embedded objects from the web server.

SUMMARY OF THE INVENTION

11. The invention provides a method and system for reducing latency in reviewing and presenting web documents to the user. A cache coupled to one or more web clients requests web documents from web servers on behalf of those web clients and communicates those web documents to the web clients for display. The cache parses the web documents as they are received from the web server, identifies references to any embedded objects, and determines if those embedded objects are already maintained in the cache. If those embedded objects are not in the cache, the cache automatically pre-fetches those embedded objects from the web server without need for a command from the web client.

12. In a preferred embodiment, the cache maintains a two-level memory including primary memory and secondary mass storage. At the time the web document is received, the cache determines if any embedded objects are maintained in the cache but are not in primary memory. If those embedded objects are not in primary memory, the cache automatically pre-loads those embedded objects from secondary mass storage to primary memory without need for a request from the web client.

13. In a preferred embodiment, web documents maintained in the cache are periodically refreshed, so as to assure those web documents are not “stale” (changed at the web server but not at the cache). The invention is applied both to original requests to communicate web documents and their embedded objects from the web server to the web client, and to refresh requests to communicate web documents and their embedded objects from the web server to the cache.

BRIEF DESCRIPTION OF THE DRAWINGS

14. FIG. 1 shows a block diagram of a system for shared cache parsing and pre-fetch.

15. FIG. 2 shows a flow diagram of a method for shared cache parsing and pre-fetch.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

16. In the following description, a preferred embodiment of the invention is described with regard to preferred process steps and data structures. Those skilled in the art would recognize after perusal of this application that embodiments of the invention can be implemented using one or more general purpose processors or special purpose processors or other circuits adapted to particular process steps and data structures described herein, and that implementation of the process steps and data structures described herein would not require undue experimentation or further invention.

17. Inventions disclosed herein can be used in conjunction with inventions disclosed in one or more of the following patent applications:

18. Provisional U.S. Application 60/048,986, filed Jun. 9, 1997, in the name of inventors Michael Malcolm and Robert Zarnke, titled “Network Object Cache Engine”, assigned to CacheFlow, Inc., attorney docket number CASH-001.

19. U.S. application Ser. No. 08/959,058, filed this same day, in the name of inventors Michael Malcolm and Ian Telford, titled “Adaptive Active Cache Refresh”, assigned to CacheFlow, Inc., attorney docket number CASH-003.

20. These applications are referred to herein as the “Cache Disclosures,” and are hereby incorporated by reference as if fully set forth herein.

21. System Elements

22. FIG. 1 shows a block diagram of a system for shared cache parsing and pre-fetch.

23. A system 100 includes a cache 110, at least one client device 120, and at least one server device 130. Each client device 120 is coupled to the cache 110 using a client communication path 121, such as a dial-up connection, a LAN (local area network), a WAN (wide area network), or some combination thereof. Similarly, each server device 130 is also coupled to the cache 110 using a server communication path 131, such as a dial-up connection, a LAN (local area network), a WAN (wide area network), or some combination thereof. In a preferred embodiment, the client communication path 121 includes a LAN, while the server communication path 131 includes a network of networks such as an internet or intranet.

24. As used herein, the terms “client” and “server” refer to a relationship between the client or server and the cache 110, not necessarily to particular physical devices. As used herein, one “client device” 120 or one “server device” 130 can comprise any of the following: (a) a single physical device capable of executing software which bears a client or server relationship to the cache 110; (b) a portion of a physical device, such as a software process or set of software processes capable of executing on one hardware device, which portion of the physical device bears a client or server relationship to the cache 110; or (c) a plurality of physical devices, or portions thereof, capable of cooperating to form a logical entity which bears a client or server relationship to the cache 110. The phrases “client device” 120 and “server device” 130 refer to such logical entities and not necessarily to particular individual physical devices.

25. The server device 130 includes memory or storage 132 having a web document 133, the web document 133 including references to at least one embedded object 134. In a preferred embodiment, the web document 133 can include text and directions for display. The embedded object 134 can include pictures, such as data in GIF or JPEG format, other multimedia data, such as animation, audio (such as streaming audio), movies, video (such as streaming video), program fragments, such as Java, Javascript, or ActiveX, or other web documents, such as when using frames.

26. The cache 110 includes a processor 111, program and data memory 112, and mass storage 113. The cache 110 maintains a first set of web objects 114 in the memory 112 and a second set of web objects 114 in the storage 113. (Web objects 114 can comprise web documents 133 or embedded objects 134 or both.)

27. In a preferred embodiment, the cache 110 includes a cache device such as described in the Cache Disclosures defined herein, hereby incorporated by reference as if fully set forth herein.

28. The cache 110 receives requests from the client device 120 for a web object 114 and determines if that web object 114 is present at the cache 110, either in the memory 112 or in the storage 113. If the web object 114 is present in the memory 112, the cache 110 transmits the web object 114 to the client device 120 using the client communication path 121. If the web object 114 is present in the storage 113 but not in the memory 112, the cache 110 loads the web object 114 into the memory 112 from the storage 113, and proceeds as in the case when the web object 114 was originally present in the memory 112. If the web object 114 is not present in either the memory 112 or the storage 113, the cache 110 retrieves the web object 114 from the appropriate server device 130, places the web object 114 in the memory 112 and the storage 113, and proceeds as in the case when the web object 114 was originally present in the memory 112.
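
The lookup order described above can be illustrated with a minimal sketch. It is not the patented implementation: the class name `TwoLevelCache`, the `fetch_from_server` callable, and the `server_requests` counter are hypothetical names introduced here, and plain dictionaries stand in for the memory 112 and the storage 113.

```python
from typing import Callable, Dict


class TwoLevelCache:
    """Toy model of the cache 110: memory 112 backed by storage 113."""

    def __init__(self, fetch_from_server: Callable[[str], bytes]) -> None:
        self.memory: Dict[str, bytes] = {}    # stands in for the memory 112
        self.storage: Dict[str, bytes] = {}   # stands in for the storage 113
        self.fetch_from_server = fetch_from_server  # hypothetical origin fetch
        self.server_requests = 0              # counts trips to the server device

    def get(self, url: str) -> bytes:
        # Case 1: present in memory, so it can be served directly.
        if url in self.memory:
            return self.memory[url]
        # Case 2: present in storage only, so load it into memory first.
        if url in self.storage:
            self.memory[url] = self.storage[url]
            return self.memory[url]
        # Case 3: present in neither level, so retrieve it from the server
        # and place it in both the memory and the storage.
        self.server_requests += 1
        body = self.fetch_from_server(url)
        self.memory[url] = body
        self.storage[url] = body
        return body
```

In this sketch only the third case contacts the server device, so repeated requests for the same object leave `server_requests` unchanged.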

29. Due to the principle of locality of reference, it is expected that the cache 110 will achieve a substantial “hit rate,” in which many requests from the client device 120 for web objects 114 will be for those web objects 114 already maintained by the cache 110, reducing the need for requests to the server device 130 using the server communication path 131.

30. The cache 110 parses each web object 114 as it is received from the server device 130, separately and in parallel to any web client program operating at the client device 120. If the web object 114 is a web document 133 that includes at least one reference to embedded objects 134, the cache 110 identifies those references and those embedded objects 134, and determines if those embedded objects 134 are already maintained in the cache 110, either in the memory 112 or the storage 113.
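
One way to picture the parsing step is the sketch below, which assumes the web document 133 is HTML and uses Python's standard `html.parser` module. The table `EMBEDDED_ATTRS`, the class `EmbeddedRefParser`, and the function `find_embedded_refs` are hypothetical names, and the choice of which tags count as references to embedded objects is an assumption made for illustration only.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin

# Tag -> attribute pairs treated as references to embedded objects.
# This selection is an assumption for the sketch, not taken from the source.
EMBEDDED_ATTRS = {
    "img": "src",
    "script": "src",
    "frame": "src",
    "iframe": "src",
    "embed": "src",
}


class EmbeddedRefParser(HTMLParser):
    """Collects absolute URLs of embedded objects, in document order."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.refs: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names already lowercased.
        wanted = EMBEDDED_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                url = urljoin(self.base_url, value)  # resolve relative references
                if url not in self.refs:             # skip duplicate references
                    self.refs.append(url)


def find_embedded_refs(html: str, base_url: str) -> List[str]:
    parser = EmbeddedRefParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.refs
```

Ordinary hyperlinks (`<a href=...>`) are deliberately ignored here, since they are followed only on user action rather than displayed with the document.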

31. If those embedded objects 134 are not in the cache 110 at all, the cache 110 automatically, without need for a command from the web client, requests those embedded objects 134 from the server device 130.

32. The cache 110 has a relatively numerous set of connections to the server communication path 131, and so is able to request a relatively numerous set of embedded objects 134 in parallel from the server device 130. Moreover, the cache 110 parses the web document 133 and requests embedded objects 134 in parallel with the web client at the client device 120 also parsing the web document 133 and requesting embedded objects 134. The embedded objects 134 are available to the cache 110, and thus to the client device 120, much more quickly.
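
The parallel requests described above might be sketched with a thread pool, as follows. The function name `prefetch_missing`, the `fetch` callable, and the default of 16 connections are assumptions for illustration; the source does not state how many connections the cache 110 uses or how it schedules them.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def prefetch_missing(
    refs: Iterable[str],
    cached: Dict[str, bytes],
    fetch: Callable[[str], bytes],
    max_connections: int = 16,  # arbitrary illustrative value
) -> Dict[str, bytes]:
    """Fetch, in parallel, every referenced object not already in `cached`."""
    # dict.fromkeys removes duplicate references while keeping their order.
    missing = [url for url in dict.fromkeys(refs) if url not in cached]
    if not missing:
        return {}
    # Each worker thread stands in for one connection to the server device.
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        bodies = list(pool.map(fetch, missing))
    fetched = dict(zip(missing, bodies))
    cached.update(fetched)
    return fetched
```

Objects already held by the cache are filtered out before any request is issued, which mirrors the determination that precedes the pre-fetch.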

33. If those embedded objects 134 are maintained in the cache 110, but they are in the storage 113 and not in the memory 112, the cache 110 automatically, without need for a command from the web client, loads those embedded objects 134 from the storage 113 into the memory 112.
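
A minimal sketch of this pre-load step is shown below, again with dictionaries standing in for the memory 112 and the storage 113. The function name `preload_into_memory` is hypothetical, and the sketch ignores memory limits and eviction, which a real cache would have to handle.

```python
from typing import Dict, Iterable, List


def preload_into_memory(
    refs: Iterable[str],
    memory: Dict[str, bytes],
    storage: Dict[str, bytes],
) -> List[str]:
    """Copy referenced objects found only in storage into memory."""
    promoted: List[str] = []
    for url in refs:
        # Only objects held in storage but absent from memory are loaded.
        if url not in memory and url in storage:
            memory[url] = storage[url]
            promoted.append(url)
    return promoted
```

References that are in neither level are left alone by this function; in the description they are handled by the separate pre-fetch from the server device 130.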

34. In a preferred embodiment, those web objects 114 maintained in the cache 110 are periodically refreshed, so as to assure those web objects 114 are not “stale” (changed at the server device 130 but not at the cache 110). To refresh web objects 114, the cache 110 selects one web object 114 for refresh and transmits a request to the server device 130 for that web object 114. The server device 130 can respond with a copy of the web object 114, or can respond with a message that the web object 114 has not changed since the most recent copy of the web object 114 was placed in the cache 110. If the web object 114 has in fact changed, the cache 110 proceeds as in the case when a client device 120 requested a new web object 114 not maintained in the cache 110 at all. If the web object 114 has in fact not changed, the cache 110 updates its information on the relative freshness of the web object 114, as further described in the Cache Disclosures.
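
The two possible server responses during a refresh can be sketched as below. Everything named here is an assumption for illustration: the `Entry` record, its `version` validator, the `conditional_fetch` callable that returns `None` for "not changed", and the `refresh` function itself. The source defers the actual freshness bookkeeping to the Cache Disclosures, and the sketch omits the re-parsing that follows when a changed object is treated like a new one.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Entry:
    body: bytes
    version: str       # hypothetical validator, e.g. a modification stamp
    checked_at: float  # when freshness was last confirmed


# Hypothetical server interface: given a URL and the version the cache holds,
# return None for "not changed", or (new_body, new_version) if it changed.
ConditionalFetch = Callable[[str, str], Optional[Tuple[bytes, str]]]


def refresh(
    url: str,
    entries: Dict[str, Entry],
    conditional_fetch: ConditionalFetch,
    now: Callable[[], float] = time.time,
) -> bool:
    """Refresh one cached object; return True if its content changed."""
    entry = entries[url]
    result = conditional_fetch(url, entry.version)
    if result is None:
        # Not changed: only the freshness information is updated.
        entry.checked_at = now()
        return False
    # Changed: replace the cached copy with the new one.
    body, version = result
    entries[url] = Entry(body, version, now())
    return True
```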

35. Method of Operation

36. FIG. 2 shows a flow diagram of a method for shared cache parsing and pre-fetch.

37. A method 200 includes a set of flow points to be noted, and steps to be executed, cooperatively by the system 100, including the cache 110, the client device 120, and the server device 130.

38. At a flow point 210, the client device 120 is ready to request a web document 133 from the server device 130. For example, the web document 133 can comprise an HTML page having a set of embedded objects 134.

39. At a step 221, the client device 120 transmits a request for the web document 133, using the client communication path 121, to the cache 110.

40. At a step 222, the cache 110 determines if that web document 133 is located in the memory 112 at the cache 110. If so, the cache 110 proceeds with the step 225. Otherwise, the cache 110 proceeds with the step 223.

41. At a step 223, the cache 110 determines if that web document 133 is located in the storage 113 at the cache 110 (but not in the memory 112). If so, the cache 110 loads the web document 133 from the storage 113 into the memory 112, and proceeds with the step 225. Otherwise, the cache 110 proceeds with the step 224.

42. At a step 224, the cache 110 transmits a request to the server device 130 for the web document 133. The server device 130 receives the request and transmits the web document 133 to the cache 110. The cache 110 stores the web document 133 in the memory 112 and the storage 113 and proceeds with the step 225.

43. At a step 225, the cache 110 transmits the web document 133 to the client device 120 for display. In parallel, the cache 110 parses the web document 133 and determines if there are any references to embedded objects 134. If not, the cache 110 proceeds with the flow point 230. Otherwise, the cache proceeds with the step 226.

44. At a step 226, the cache 110 identifies the embedded objects 134 and repeats the steps 222 through 226 inclusive (including repeating this step 226) for each such embedded object 134. Web documents 133 in “frame” format can refer to embedded objects 134 that are themselves web documents 133 and themselves refer to embedded objects 134, and so on. There is no prospect of an infinite loop if a web document 133 is self-referential, because the cache 110 will simply discover at the second reference that the web document 133 is already maintained in the cache 110.
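
The termination argument can be checked with a small sketch. It takes one reading of the step above, namely that recursion stops as soon as a referenced object is found in the cache. The function `prefetch_tree`, the crude `SRC_RE` pattern (which only matches `src="..."` attributes), and the `fetched_log` list are hypothetical and exist only for illustration.

```python
import re
from typing import Callable, Dict, List

# Deliberately crude reference extractor for this sketch (an assumption):
# it only finds src="..." attributes and treats the values as cache keys.
SRC_RE = re.compile(r'src="([^"]+)"')


def prefetch_tree(
    url: str,
    cache: Dict[str, str],
    fetch: Callable[[str], str],
    fetched_log: List[str],
) -> None:
    """Fetch `url` and, recursively, everything it embeds, each at most once."""
    if url in cache:
        # Already maintained in the cache: this is what ends the recursion,
        # including for documents that refer back to themselves.
        return
    cache[url] = fetch(url)
    fetched_log.append(url)
    for ref in SRC_RE.findall(cache[url]):
        prefetch_tree(ref, cache, fetch, fetched_log)
```

Because each object is stored in the cache before its references are followed, a frame that embeds itself, or two frames that embed each other, are each fetched once and then found in the cache on the next reference.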

45. At a flow point 230, the web document 133 and all its embedded objects 134 have been transmitted to the client device 120 for display.

46. When the cache 110 refreshes a web object 114, the cache 110 performs the steps 222 through 226 inclusive (including repeating the step 226) for the web object 114 and for each identified embedded object 134 associated with the web object 114.

47. Alternative Embodiments

48. Although preferred embodiments are disclosed herein, many variations are possible which remain within the concept, scope, and spirit of the invention, and these variations would become clear to those skilled in the art after perusal of this application.

Referenced by
Citing Patent | Filing date | Publication date | Applicant | Title
US6393526 | Oct 28, 1997 | May 21, 2002 | Cache Plan, Inc. | Shared cache parsing and pre-fetch
US6715037 | Jul 26, 2002 | Mar 30, 2004 | Blue Coat Systems, Inc. | Multiple cache communication and uncacheable objects
US7383289 * | Dec 2, 2003 | Jun 3, 2008 | Sap Aktiengesellschaft | Updating and maintaining data in a multi-system network using asynchronous message transfer
US7552223 | Apr 25, 2003 | Jun 23, 2009 | Netapp, Inc. | Apparatus and method for data consistency in a proxy cache
US7631078 * | Jan 16, 2007 | Dec 8, 2009 | Netapp, Inc. | Network caching device including translation mechanism to provide indirection between client-side object handles and server-side object handles
US8082304 | Dec 10, 2004 | Dec 20, 2011 | Cisco Technology, Inc. | Guaranteed delivery of application layer messages by a network element
US8090839 | Jun 21, 2006 | Jan 3, 2012 | Cisco Technology, Inc. | XML message validation in a network infrastructure element
US8266327 | Jun 15, 2006 | Sep 11, 2012 | Cisco Technology, Inc. | Identity brokering in a network element
US8275778 * | Apr 4, 2011 | Sep 25, 2012 | Parallel Networks, Llc | Method and system for adaptive prefetching
US8458467 | Apr 5, 2006 | Jun 4, 2013 | Cisco Technology, Inc. | Method and apparatus for adaptive application message payload content transformation in a network infrastructure element
US8463802 | Sep 30, 2010 | Jun 11, 2013 | Sandisk Il Ltd. | Card-based management of discardable files
US8549229 | Sep 30, 2010 | Oct 1, 2013 | Sandisk Il Ltd. | Systems and methods for managing an upload of files in a shared cache storage system
US20100235473 * | Mar 9, 2010 | Sep 16, 2010 | Sandisk Il Ltd. | System and method of embedding second content in first content
US20110185004 * | Apr 4, 2011 | Jul 28, 2011 | Parallel Networks, Llc | Method and system for adaptive prefetching
USRE39306 * | Feb 25, 2004 | Sep 26, 2006 | Matsushita Electric Industrial Co., Ltd. | Optical disc device
EP1540498A2 * | Sep 16, 2003 | Jun 15, 2005 | Network Appliance, Inc. | Apparatus and method for proxy cache
WO2014085728A1 * | Nov 28, 2013 | Jun 5, 2014 | Microsoft Corporation | Unified search result service and cache update
Classifications
U.S. Classification: 711/130, 707/E17.12
International Classification: G06F17/30
Cooperative Classification: G06F17/30902
European Classification: G06F17/30W9C
Legal Events
Date | Code | Event | Description
Jan 29, 2014 | FPAY | Fee payment
  Year of fee payment: 12
Jul 3, 2013 | AS | Assignment
  Owner name: JEFFERIES FINANCE LLC, AS COLLATERAL AGENT, NEW YO
  Free format text: SECOND LIEN PATENT SECURITY AGREEMENT;ASSIGNOR:BLUE COAT SYSTEMS, INC.;REEL/FRAME:030740/0181
  Effective date: 20130628
Oct 16, 2012 | AS | Assignment
  Effective date: 20121016
  Free format text: RELEASE OF SECURITY INTEREST IN PATENT COLLATERAL RECORDED AT R/F 027727/0178;ASSIGNOR:JEFFERIES FINANCE LLC, AS COLLATERAL AGENT;REEL/FRAME:029140/0170
  Owner name: BLUE COAT SYSTEMS, INC., CALIFORNIA
Feb 16, 2012 | AS | Assignment
  Owner name: JEFFERIES FINANCE LLC, NEW YORK
  Effective date: 20120215
  Free format text: FIRST LIEN PATENT SECURITY AGREEMENT;ASSIGNOR:BLUE COAT SYSTEMS, INC.;REEL/FRAME:027727/0144
  Free format text: SECOND LIEN PATENT SECURITY AGREEMENT;ASSIGNOR:BLUE COAT SYSTEMS, INC.;REEL/FRAME:027727/0178
Jan 29, 2010 | FPAY | Fee payment
  Year of fee payment: 8
Feb 3, 2006 | FPAY | Fee payment
  Year of fee payment: 4
Sep 11, 2002 | AS | Assignment
  Owner name: BLUE COAT SYSTEMS, INC., CALIFORNIA
  Free format text: CHANGE OF NAME;ASSIGNOR:CACHEFLOW, INC.;REEL/FRAME:013081/0212
  Effective date: 20020820
  Owner name: BLUE COAT SYSTEMS, INC. 650 ALMANOR AVENUESUNNYVAL
  Free format text: CHANGE OF NAME;ASSIGNOR:CACHEFLOW, INC. /AR;REEL/FRAME:013081/0212