This invention pertains to global information networks, currently referred to as the Internet or Internet systems, and in particular, to a system for providing a comprehensive global information network broadcasting system and the methods of implementing the same using broadcast links to overcome the limitations in network distribution and caching systems inherent in conventional designs.
BACKGROUND OF THE INVENTION
The explosion in the use of the Internet and other similar systems has created massive performance demands on the Internet Protocol (IP) and the communication infrastructure associated with the Internet. The areas which are experiencing this communication and application explosion may include any IP network or Internet, public or private, or any group of computers connected together. The present invention has particular application in the current system referred to as the Internet.
The performance demands on the network are further compounded by the inherent limitations in the IP network architecture and the popularity of certain applications on the network. Some of the most popular applications on the Internet, such as the web browser, construct, or attempt to construct, a point-to-point or end-to-end connection across the network. With the Internet browser application, the Internet participant “points” the web browser to a uniform resource locator (“URL”) address which, in turn, the browser uses to attempt to connect to the network and display the information at the URL address.
An end-to-end connection across the network makes network performance parameters such as latency and network queuing delays dependent, at least in part, on each link in the point-to-point chain of connection. Since IP also has inherent data concentration characteristics, the performance of the network may be significantly degraded by traffic concentration on the network backbones. Thus, a problem in the conventional IP network is that “end-to-end” latency is often dominated by the latency of the most congested link, and data concentration may cause high latency on over-subscribed backbone links.
A problem related to network congestion and data concentration is the present rate of growth in the popularity of the Internet and its applications. At the present rate of growth, increasing or even maintaining network performance simply by increasing backbone size is a problematic solution; e.g., at the current rate of growth in Internet usage, backbones and communication equipment may require replacement before their costs can be recovered. Thus, the conventional architecture and pricing structure for Internet service may not be self-financing in some instances.
Another systemic source of network demand is the increase in the number of times that the network is being called upon to move the same data to multiple users. In practice this may be caused by the increasing popularity of particular websites or the so-called web portals.
The problem of redundant data transport has been addressed, in part, through the use of network caches. Network caches store data inside the network and service the user demand for data from data stored in the cache. Thus, network caches may reduce the number of identical items which are being passed end-to-end through the network by locally servicing the request for data from the local cache. The success of the network cache, however, is hampered by the fact that the ideal location, or optimal position, for the cache (or caches) is at the edge of the network infrastructure, as close as possible to the end user. Thus, the optimal positioning of caches, near the edge of the network, inherently presents communication and coordination challenges.
Caching at the edges of the network, e.g., using many small caches at the network edges rather than a few large central caches at the center of the network, is further complicated by the fact that the small caches may have a limited cache community size. A limited or small cache community size means that there are few users using any one cache. A small cache community size is typically associated with a small number of requests for information, which makes it difficult, if not impossible, to mathematically achieve a high cache hit rate.
The cache hit rate is a mathematical term that expresses the number of hits encountered in the use of the cache per 100 requests for information. A high cache hit rate means that a high percentage of user requests are serviced by the cache. This means that the cache is working to reduce the load on the network. The cache hit rate, however, is dependent upon the number of users of the cache or members of the cache community. Thus, an engineering trade-off exists in the conventional cache design, i.e., a cache is more useful at improving latency at the edge of a network but the cache will, on average, have a lower hit rate because of the small cache community size.
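The definition above can be written as a short calculation. The following is a minimal sketch only; the function name `cache_hit_rate` and its arguments are illustrative assumptions and do not appear in the original disclosure.

```python
def cache_hit_rate(hits: int, requests: int) -> float:
    """Return the number of cache hits per 100 requests.

    `hits` and `requests` are hypothetical counters kept by a cache;
    they are assumptions made for this illustration.
    """
    if hits < 0 or requests < 0 or hits > requests:
        raise ValueError("expected 0 <= hits <= requests")
    if requests == 0:
        # No requests yet: report zero rather than dividing by zero.
        return 0.0
    return 100.0 * hits / requests


# A cache that serviced 45 of 100 requests locally.
example_rate = cache_hit_rate(45, 100)
```

Under this definition, every request counted as a hit is one that did not require an end-to-end connection across the network.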
Another problem in the conventional network is the level of general broadcasting that can be accomplished within the conventional architecture. As the Internet was established, the vast majority of network traffic was point-to-point in nature. In the present network, however, broadcast data on the network has surpassed other forms of traffic in terms of volume, but the network continues to have a point-to-point architecture which does not provide the physical medium or logical structure to implement broadcast within the network. The result is that the Internet is choking itself with replicated data, moving thousands of copies of the same data around at any given moment in time. The major difference between now and when the network originated is the increased size of the transmission lines and switch capacity, which are able to move more data. The IP network, however, is still using the same basic architecture as was found in the original system.
Another factor that affects network performance is that most of the data on the Internet is accessed infrequently. A small proportion of the data available on the Internet is receiving the majority of the inquiries or “hits” on the system.
There have been a number of attempts to improve network performance. One approach is to employ larger capacity storage equipment and/or faster communication equipment. This may provide faster network response time and/or ameliorate network congestion and delays in the short term. Indeed, the continuing availability of larger capacity and lower cost storage technology has made this a cost-effective, though stopgap, short-term approach to network congestion. As discussed above, however, the rate of growth in the Internet's popularity may require equipment replacement before equipment costs can be recovered. Also, a number of United States patents describe attempts to improve the speed and storage capacity of interactive networks through a number of different methods. Those patents include U.S. Pat. No. 5,442,771 issued to Robert Filepp et al. for a “Method For Storing Data In A Interactive Computer Network” and U.S. Pat. No. 5,588,060 issued to Ashar Aziz for a “Method And Apparatus For A Key Management Scheme For Internet Protocols.”
SUMMARY OF THE INVENTION
It is the goal of the present invention to address these shortfalls and problem areas to improve performance of the Internet. Thus, a first object of the present invention is to achieve real improvement in performance over conventional caching system designs through the use of a novel and nonobvious scheme to increase the local cache hit rates by employing methods and apparatus to improve the selection of data for storage in a local cache.
Another object of the present invention is a way to mesh a broadcast architecture into the point-to-point architecture of the Internet to enable the network to achieve the advantages of a broadcast architecture while maintaining the benefits of a point-to-point network.
Another object of the present invention is to combine the methods and apparatus for improved cache performance with the methods and apparatus used to mesh a broadcast architecture onto the point-to-point network architecture to achieve a complementary result.
Another object of the present invention is to extrapolate global demand for information into a tangible and practical solution to select data for storage into local cache devices thereby improving cache performance for caches with a small cache community size.
Another object of the present invention is the extrapolation of a statistically relevant sample from a list of requests for information that may modify a threshold of interest parameter for the selection of information into a local cache.
Another object of the present invention is to modify a threshold of interest in the selection of data of interest for input into a local cache based at least in part on historical interest in local demand for said data over a predetermined window of time.
Another object of the present invention is the employment of a proactive way to select data for input into a local cache in anticipation of network demand for said data of interest.
Another object of the present invention is the directed selection of information into a particular local cache to achieve improvements in local cache performance.
Yet another object of the present invention is the deployment of a fee based broadcast service that improves local cache performance which in turn allows Internet service providers to achieve a greater return on investment in communication equipment and frees up network capacity to add additional Internet subscribers.
These and other objects of the present invention, as discussed in detail below, will become apparent to those skilled in the relevant art upon disclosure of the inventions and teachings contained herein.
A way to improve the Internet's performance is to improve the cache hit rate for at least some of the caches in the network. When a cache services the user's request for information, the network conserves capacity because an end-to-end connection is not required to service the request. A novel way to improve the selection of data for storage in a local cache is to determine the interest in the data on the network as a whole, or in a sample that is representative of the network as a whole. This may be accomplished by a system that measures the number of access requests for, and the type of, information that was not available on the local caches. This can be called local cache miss information. The system may then examine the local cache miss information from some or all of the local sites and determine what information is of global interest to the Internet community. The system may then determine, by a variety of ways discussed further below, what information is a good selection for storage into local caches. Thus, the system provides a way to determine the selection of information for storage into a local cache from a pool of local cache miss information.
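One way such a selection from pooled local cache miss information might look is sketched here. This is an illustration under stated assumptions, not the selection method of the invention: the function name `select_for_local_caches`, the shape of `miss_reports`, the example URLs, and the fixed `threshold` are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def select_for_local_caches(
    miss_reports: Dict[str, Iterable[str]], threshold: int
) -> List[str]:
    """Pick items whose pooled miss count reaches a threshold of interest.

    `miss_reports` maps a local cache site name to the URLs that missed
    at that site (an assumed report format). `threshold` is an assumed,
    fixed threshold of interest.
    """
    global_misses: Counter = Counter()
    for urls in miss_reports.values():
        # Pool the misses from every reporting site into one count.
        global_misses.update(urls)
    return sorted(
        url for url, count in global_misses.items() if count >= threshold
    )


# Three small sites, none of which sees much demand on its own.
reports = {
    "site_a": ["http://example.com/news", "http://example.com/maps",
               "http://example.com/news"],
    "site_b": ["http://example.com/news", "http://example.com/scores"],
    "site_c": ["http://example.com/maps", "http://example.com/scores",
               "http://example.com/scores"],
}
selected = select_for_local_caches(reports, threshold=3)
```

In this example no single site reports an item three times, yet two items reach the threshold once the reports are pooled, which is the effect of treating the small cache communities as one.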
A second element that may improve the operation of the Internet is a broadcast system which takes the information or data that has been determined to be of sufficient interest that it is useful to input into local caches and broadcast that information and data to the local cache systems. This action may relieve the network from the identified problem of transporting replicated data and redundant information across network backbones. This high speed cache update or broadcast channel provides the network with fast relief from redundant data transport and will quickly reduce congestion across the entire Internet system.
The two methodologies of local cache sampling and broadcast cache updates complement each other, each providing a synergistic solution to the other's individual weaknesses, thereby allowing the two technologies to blend into a single unique solution to the problems described herein. For the problem of multiple identical data elements traversing the Internet, caching represents a good solution, but its practical application is limited by the tradeoff between the optimal positioning of the cache and the low hit rates of small cache communities. Satellite one-way broadcasting, when combined with the data evaluation and selection described herein, addresses this problem by aggregating cache community elements from all cache clients into one single cache community, thus allowing high hit rates to be achieved.
The use of satellite communications to provide a broadcast medium to the Internet may be accomplished by orbital satellites which allow a single signal to be sent up to a satellite and the resulting signal to be sent down to large geographic areas. A conventional satellite broadcast, however, suffers from the fact that all users may not want to use the broadcast information at exactly the same time. The store and forward capability of a cache, which accepts information and then stores it for a time so that it can be used at times other than the exact time that it is broadcast, solves this major difficulty with satellite one-way broadcast.
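The store and forward behavior described above can be illustrated with a small sketch. The class name `StoreAndForwardCache`, the `hold_seconds` retention period, and the explicit `now` timestamps are assumptions introduced for this example; the disclosure states only that information is stored "for a time".

```python
from typing import Dict, Optional, Tuple


class StoreAndForwardCache:
    """Holds broadcast items so they can be used after the broadcast."""

    def __init__(self, hold_seconds: float) -> None:
        # Assumed retention period; not specified in the disclosure.
        self.hold_seconds = hold_seconds
        self._items: Dict[str, Tuple[str, float]] = {}

    def receive_broadcast(self, url: str, data: str, now: float) -> None:
        # Store the item with the time at which it was received.
        self._items[url] = (data, now)

    def lookup(self, url: str, now: float) -> Optional[str]:
        entry = self._items.get(url)
        if entry is None:
            return None
        data, received_at = entry
        if now - received_at > self.hold_seconds:
            # The item has been held past the retention period.
            del self._items[url]
            return None
        return data


cache = StoreAndForwardCache(hold_seconds=3600.0)
cache.receive_broadcast("http://example.com/news", "front page", now=0.0)
ten_minutes_later = cache.lookup("http://example.com/news", now=600.0)
two_hours_later = cache.lookup("http://example.com/news", now=7200.0)
```

A user who asks ten minutes after the broadcast is still served from the local cache, so the broadcast and the request need not coincide.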
This invention, inter alia, teaches a method for combining the capabilities of satellite communications and caching servers to overcome the disadvantages of each and, at the same time, improve the levels of hit rate that may be achieved by caching servers, thereby saving bandwidth and other valuable resources within the Internet and other data networks which can use these technologies. This invention, inter alia, further teaches how to construct a selection system which uses one-way satellite communications in order to build a true broadcast capability as an addition to the existing point-to-point Internet network, and to use this broadcast capability to aggregate the cache community size, thus increasing the hit rates of all caches which subscribe to the service without regard to the number of members in any individual cache server's cache community.
Thus, the present invention provides a complete comprehensive Internet broadcasting system that employs a caching system that is positioned close to the end user while still being part of the shared infrastructure and achieving a high cache hit rate. The system further provides a complete comprehensive Internet broadcasting system which seamlessly overlays a capability on the existing Internet that may allow a real broadcast so that the data or information can be transmitted once and received at the local caching systems.
This hybrid broadcast/cache architecture is very adaptable. Furthermore, the system is easy to install and readily available to all customers and Internet service providers. The system works with conventional cache systems, such as those available from Inktomi, Inc. and with conventional commercial satellite services such as GTE Spacenet or Hughes Satellite Systems.
Particularly, this invention, inter alia, teaches a method for implementing a comprehensive global information network broadcasting system, for use in overcoming inherent limitations in current global information network systems including the requirement for multiple copies of the same information or data being moved around the Internet to serve individual users along with the point to point nature of the infrastructure, comprising the steps of providing a master caching center for receiving information requests and sending out information and data; installing local caching systems at Internet service provider and customer sites; providing a satellite broadcast linking system to the local caching system for providing nearly instantaneous information from the master caching center to the local caching systems; disseminating a program for selecting data elements for storage in the local caching systems; and distributing data and information updates for the local caching systems as predetermined by the master caching center.
This invention, inter alia, also teaches a method of operating a comprehensive global information network broadcasting system, for use in overcoming inherent limitations in current global information network systems including the requirement for multiple copies of the same information or data being moved around the Internet to serve individual users along with the point to point nature of the infrastructure, comprising the steps of receiving a request for information or data from a customer to the local cache site; determining the location of the requested information or data among a number of location sources; notifying the master cache center of the lack of success in finding the requested data or information in the local cache system; analyzing the number of requests that the master cache center has received on a particular piece of information or data; retrieving the data or information from the Internet once the level of interest has been achieved; and sending the requested information or data through the satellite broadcasting system to all local cache sites once the data or information requests have reached a predetermined level.
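The operating steps recited above can be sketched as follows. This is a simplified illustration and not the claimed implementation: the class names, the `fetch` callable standing in for retrieval from the Internet, the in-process loop standing in for the satellite broadcast, and the fixed `threshold` are all assumptions made for the example.

```python
from typing import Callable, Dict, List


class MasterCachingCenter:
    """Counts miss notifications and broadcasts once interest is reached."""

    def __init__(self, fetch: Callable[[str], str], threshold: int) -> None:
        self.fetch = fetch          # assumed stand-in for Internet retrieval
        self.threshold = threshold  # assumed predetermined level of interest
        self.miss_counts: Dict[str, int] = {}
        self.local_caches: List["LocalCache"] = []

    def report_miss(self, url: str) -> None:
        self.miss_counts[url] = self.miss_counts.get(url, 0) + 1
        if self.miss_counts[url] == self.threshold:
            data = self.fetch(url)
            # Stand-in for the one-way satellite broadcast to all sites.
            for cache in self.local_caches:
                cache.receive_broadcast(url, data)


class LocalCache:
    """Services requests locally and notifies the master of misses."""

    def __init__(self, master: MasterCachingCenter,
                 fetch: Callable[[str], str]) -> None:
        self.store: Dict[str, str] = {}
        self.master = master
        self.fetch = fetch
        master.local_caches.append(self)

    def request(self, url: str) -> str:
        if url in self.store:
            return self.store[url]      # hit: no end-to-end connection
        self.master.report_miss(url)    # notify the master of the miss
        data = self.fetch(url)          # conventional point-to-point fetch
        self.store[url] = data
        return data

    def receive_broadcast(self, url: str, data: str) -> None:
        self.store[url] = data


origin = {"http://example.com/a": "page A"}
fetch_log: List[str] = []


def fetch_from_origin(url: str) -> str:
    fetch_log.append(url)
    return origin[url]


master = MasterCachingCenter(fetch_from_origin, threshold=2)
site_a = LocalCache(master, fetch_from_origin)
site_b = LocalCache(master, fetch_from_origin)
site_c = LocalCache(master, fetch_from_origin)

site_a.request("http://example.com/a")  # miss 1
site_b.request("http://example.com/a")  # miss 2 triggers the broadcast
fetches_before = len(fetch_log)
result_c = site_c.request("http://example.com/a")  # served locally
fetches_after = len(fetch_log)
```

In this run the third site never requested the item before, yet it is served from its local cache because misses at the other two sites reached the threshold and triggered a broadcast.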
This invention, inter alia, further teaches a comprehensive global information network broadcasting system, for use in overcoming inherent limitations in current global information network systems including the requirement for multiple copies of the same information or data being moved around the Internet to serve individual users along with the point to point nature of the infrastructure, comprising a master caching center for receiving information requests and sending out information and data; local cache systems positioned at customer and Internet service provider sites for sending out information and data requests and receiving and storing the information requested; means for connecting said master caching center with said local cache systems; and means for determining the level and interest in a particular piece of information or data and allowing the information and data to be sent from the master caching center to the local cache systems.