US 20040015580 A1
A method and apparatus for tracking and reporting traffic activity on a web site whereby cookie data is compiled at the visitor computer using cookie processing script embedded within the web page downloaded over the Internet and operable on the visitor computer. Data mining code within the downloaded web page is operable on the visitor computer to obtain web browsing data. The cookie processing script operates in consideration of this web browsing data and an old cookie previously stored on the visitor computer and associated with the visited web page to obtain new cookie values. These new cookie values are then stored on the visitor computer and also attached to an image request sent to a data collection server where they are processed and posted for viewing by the web page owner. As cookie processing and writing occurs completely within the visitor computer, cookie-blocking technologies are circumvented.
1. A method for tracking and reporting traffic activity on a web site comprising the steps of:
storing a web page on a first server coupled to a wide area network, said web page having web page code and data mining code including a cookie processing script;
uploading the web page to a visitor computer responsive to a request over the wide area network from the visitor computer;
operating the data mining code on the visitor computer to obtain web browsing data; and
operating the cookie processing script on the web browsing data to obtain new cookie values; and
storing the new cookie on the visitor computer including the new cookie values.
2. The method of
3. The method of
attaching the new cookie values to an image request associated with a designated URL source; and
sending the image request to the URL source.
4. The method of
5. The method of
compiling the web browsing data into a web page traffic report; and
posting the report for viewing over the wide area network.
6. The method of
7. The method of
8. The method of
detecting that an old cookie exists on the visitor computer associated with the web site;
tracking events on the visitor computer;
processing the old cookie using cookie processing code in view of the tracked events to obtain new cookie values; and
replacing the old cookie values with the new cookie values.
9. A method for analyzing activity on a web page of a web site comprising the steps of:
embedding data mining script within a web page;
embedding cookie processing script within the web page;
sending the web page to a client node;
operating the data mining script on the client node;
operating the cookie processing script on the client node; and
returning data resulting from the operation steps.
10. The method of
reading a cookie value from the client node;
tracking events on the client node;
processing cookie value based on the tracked events to obtain a new cookie value; and
writing a new cookie value to the client node.
11. The method of
embedding data within an image request associated with a designated URL source; and
sending the image request to the URL source.
12. The method of
compiling the data into a web page traffic report; and
posting the report for viewing over the wide area network.
13. An article comprising:
a computer-readable modulated carrier signal;
means embedded in the signal for mining data from a client node; and
means embedded in the signal for processing a cookie on the client node.
 This application claims the benefit of U.S. Provisional Patent Application No. 60/245,553 filed Nov. 2, 2000 whose contents are incorporated herein for all purposes.
 1. Field of the Invention
 The present application relates to compiling and reporting data associated with activity on a network server and more particularly to generating and processing cookies directly on a client node to report web traffic data from the client node to a server responsible for compiling such data.
 2. Description of the Prior Art
 Programs for analyzing traffic on a network server, such as a worldwide web server, are known in the art. One such prior art program is described in U.S. patent application Ser. No. 09/240,208, filed Jan. 29, 1999, owned by applicant for the present invention, for a Method and Apparatus for Evaluating Visitors to a Web Server, which is incorporated herein by reference for all purposes. In these prior art systems, the program typically runs on the web server that is being monitored. Data is compiled, and reports are generated on demand—or are delivered from time to time via email—to display information about web server activity, such as the most popular page by number of visits, peak hours of website activity, most popular entry page, etc.
 Each subscriber has a password to access a page on the service provider's server. This page includes a set of tables that summarize, in real time, activity on the customer's web site.
 Turning now to FIG. 1, indicated generally at 10 is a highly schematic view of a portion of the Internet implementing the present invention. Included thereon is a worldwide web server 12. Server 12, in the present example, is operated by a business that sells products via server 12, although the same implementation can be made for sales of services via the server. The server includes a plurality of pages that a site visitor can download to his or her computer, like computer 14, using a conventional browser program running on the computer. Examples of the type of pages that a visitor can download include informational pages and pages that describe the business and the products or services that are offered for sale.
 As mentioned above, it would be advantageous to the seller to have an understanding about how customers and potential customers use server 12. As also mentioned above, it is known to obtain this understanding by analyzing web-server log files at the server that supports the selling web site. It is also known in the art to collect data over the Internet and generate activity reports at a remote server.
 When the owner of server 12 first decides to utilize a remote service provider to generate such reports, he or she uses a computer 16, which is equipped with a web browser, to visit a web server 18 operated by the service provider. On server 18, the subscriber opens an account and creates a format for real-time reporting of activity on server 12.
 When the subscriber would like to see and print real-time statistics, the subscriber uses computer 16 to access server 18, which in turn is connected to database server 24 at the service provider's location. The owner can then see and print reports, like those available through the webtrendslive.com reporting service operated by the assignee of this application, that provide real-time information about the activity at server 12.
 The above-described arrangement for monitoring web server activity by a service provider over the Internet is generally known in the art. Information analyzed in prior art systems generally consists of what might be thought of as technical data, such as most popular pages, referring URLs, total number of visitors, returning visitors, etc.
 One known method for implementing this service is to load cookies on the computer of the visitor to the web page, where the cookies contain state information identifying that visitor (such as a unique visitor ID) and other information associated with that visitor (such as how many times the visitor has visited the particular web site). Despite the useful features that cookies provide to a user, there has been a recent backlash against cookies as a perceived invasion of privacy. Modern web browsers now have a feature that allows a user to block all cookies and/or block cookies originating from third party web sites. This feature prevents web traffic analysis service providers from obtaining the information they need to serve their customers.
 A method and apparatus is disclosed for setting cookie values from the client browser. Cookie values are read and written from the client browser and then sent to a processor on another computer. This process is used to avoid the alerts generated by web browsers when third-party (out of domain) cookies are accessed.
 The cookie is first read by a script that is included in the web page code downloaded from a server coupled to the visitor computer over a wide area network such as the Internet. The same script then processes the data as fully as it can. Operation of the script then causes the computer to write new values back into the cookie and replace the old cookie values with the new values on the visitor computer hard drive. The script builds a string of all the data it has acquired and then passes it to a server by embedding the information into a request for an image.
 The foregoing and other objects, features and advantages of the invention will become more readily apparent from the following detailed description of a preferred embodiment of the invention that proceeds with reference to the accompanying drawings.
FIG. 1 is a schematic view of a portion of the Internet on which the invention is operated.
FIG. 2 is a block diagram illustrating the interaction between a web page server and a client node during web page request transactions according to methods known in the art.
FIG. 3 is a block diagram illustrating the interaction between a web page server, a client node, and a third party advertisement server during web page request transactions according to methods known in the art.
FIG. 4 is a block diagram illustrating the interaction between a web server, a client node, and a third party visitor tracking server during web page request transactions according to methods known in the art.
FIG. 5 is a block diagram illustrating the interaction between a web server, a client node, and a third party visitor tracking server during web page request transactions according to a preferred embodiment of the invention.
 APPENDIX A shows exemplary computer code used within a web page to implement the invention.
 The description first includes a technical description of cookies and how such are used in web sites to track visitors, and then proceeds with how the present invention operates to allow visitor tracking in view of current technology developed to block third-party cookies.
 What are Cookies?
 A cookie is a piece of text that a web server can store on a user's hard disk. Cookies allow a web site to store information on a user's machine and later retrieve it. The pieces of information are stored as “name-value pairs” comprised of, for instance, a variable name (e.g. UserID) and a value (e.g. A9A3BECE0563982D) associated with that variable name.
 Taking the web browser Microsoft Internet Explorer as an example, cookies are typically stored on a machine running Windows 9x in a directory called c:\windows\cookies. The directory may list a vast number of name-value pairs, each associated with a particular domain from which they originated, representing all of the web sites that have placed a cookie on that particular computer. An example of a cookie file is shown below:
 UserID A9A3BECE0563982D www.goto.com/
 The cookie above is typical of the type stored on a visitor's computer (hereinafter the client node) when visiting the web site located at the domain goto.com. The name of the name-value pair is UserID, and the value is A9A3BECE0563982D. Both the name and value of the pair are generated according to an algorithm programmed in the cookie server associated with the domain web site. The first time the client node browses the goto.com web site, software on that web site assigns a unique ID number for each visitor and instructs the browser on the client node to store the name-value pair as a cookie in a designated folder where it can be retrieved later. The same name-value pair data is stored on the goto.com cookie server along with other information so that the visitor can be identified later.
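 As a minimal illustration of the name-value structure described above, the following Python sketch splits one line of the cookie listing into its name, value and domain. The whitespace-separated layout and the names CookieEntry and parse_cookie_line are assumptions made for illustration only; the format a browser actually uses on disk may differ.

```python
from typing import NamedTuple


class CookieEntry(NamedTuple):
    """One name-value pair together with the domain that set it."""
    name: str
    value: str
    domain: str


def parse_cookie_line(line: str) -> CookieEntry:
    """Split a 'name value domain' line (assumed layout) into its parts."""
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    return CookieEntry(*parts)


entry = parse_cookie_line(" UserID A9A3BECE0563982D www.goto.com/")
print(entry.name, entry.value, entry.domain)
```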
 Cookies operate according to an industry standard called “Cookie RFC” (request for comment).
 A more complicated example of a cookie is shown below in reference to the eCommerce web site amazon.com. Visits to the amazon.com web site result in the storage of a more comprehensive set of information on the client node visiting the web site. The resulting cookie from such a visit is comprised of the following “crumbs”:
 Session-id-time 954242000 amazon.com/
 Session-id 002-4135256-7625846 amazon.com/
 x-main eKQIfwnxuF7qtmX52x6VWAXh@ih6Uo5H amazon.com/
 ubid-main 077-9263437-9645324 amazon.com/
 Each of these portions of the cookie, or “crumbs”, is associated with the amazon.com domain. Based on these crumbs, it appears that amazon.com stores a main user ID, an ID for each session, and the time the session started on the visitor computer (as well as an x-main value, which could be anything). While the vast majority of sites store just one piece of information—a user ID—on a visitor computer, there is really no limit to the amount of information such sites can store on the visitor computer in name-value pairs.
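 The grouping of crumbs under a single domain can be sketched as follows. The listing text is copied from the example above; the function name crumbs_by_domain and the nested-dictionary representation are illustrative assumptions, not a description of how any browser stores its cookie files.

```python
from collections import defaultdict

LISTING = """\
Session-id-time 954242000 amazon.com/
Session-id 002-4135256-7625846 amazon.com/
x-main eKQIfwnxuF7qtmX52x6VWAXh@ih6Uo5H amazon.com/
ubid-main 077-9263437-9645324 amazon.com/
"""


def crumbs_by_domain(listing: str) -> dict:
    """Group 'name value domain' lines into {domain: {name: value}}."""
    jar = defaultdict(dict)
    for line in listing.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name, value, domain = line.split()
        jar[domain][name] = value
    return dict(jar)


jar = crumbs_by_domain(LISTING)
for name, value in jar["amazon.com/"].items():
    print(f"{name} = {value}")
```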
 How Does Cookie Data Move?
 A name-value pair is simply a named piece of data. It is not a program, and it cannot “do” anything. A web site can retrieve only the information that it has placed on the client node computer. It cannot retrieve information from other cookie files, or any other information from the client node.
 The data moves in the following manner. If one were to type the URL of a web site into a computer browser, the browser sends a request to the web site for the page. For example, if one were to type the URL http://www.amazon.com into the browser, the browser will contact Amazon's server and request its home page. When the browser does this, it will look on the requesting machine for a cookie file that Amazon has set. If it finds an Amazon cookie file, the browser will send all of the name-value pairs in the file to Amazon's server along with the URL. If it finds no cookie file, it will send no cookie data. Amazon's web server receives the cookie data and the request for a page. If name-value pairs are received, Amazon can use them.
 If no name-value pairs are received, Amazon knows that the visitor operating that computer has not visited before. The server creates a new ID for that visitor in Amazon's database and then sends name-value pairs to the computer in the header for the web page it sends. The computer stores the name-value pairs on its hard disk drive according to the Cookie RFC protocol.
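 The server-side decision described above (use the received pairs if present, otherwise create a new ID and send it back in the response header) might be sketched as below. The sketch uses Python's standard http.cookies module; the function name handle_request, the in-memory set standing in for the site's database, and the 16-character ID format are assumptions for illustration.

```python
import uuid
from http.cookies import SimpleCookie


def handle_request(cookie_header: str, known_visitors: set) -> tuple:
    """Return (visitor_id, set_cookie_header or None) for one page request."""
    jar = SimpleCookie()
    jar.load(cookie_header)  # parse the pairs the browser sent, if any
    if "UserID" in jar and jar["UserID"].value in known_visitors:
        # Returning visitor: nothing new needs to be stored.
        return jar["UserID"].value, None
    # First visit: create an ID, remember it, and ask the browser to store it.
    visitor_id = uuid.uuid4().hex.upper()[:16]
    known_visitors.add(visitor_id)
    reply = SimpleCookie()
    reply["UserID"] = visitor_id
    return visitor_id, reply.output(header="Set-Cookie:")


database = set()
first_id, header = handle_request("", database)
second_id, no_header = handle_request(f"UserID={first_id}", database)
print(header)
```

 On the second call the browser presents the stored pair, so the same ID is recognized and no new cookie header is produced.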
 The web server can change name-value pairs or add new pairs whenever the visitor returns to the site and requests a page.
 There are other pieces of information that the server can send with the name-value pair. One of these is an expiration date. Another is a path so that the site can associate different cookie values with different parts of the site.
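 As a hedged example of these optional attributes, the sketch below attaches a path and an expiration date to a name-value pair using Python's standard http.cookies module. The path "/books" and the date are arbitrary values chosen for illustration.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["UserID"] = "A9A3BECE0563982D"
# Restrict the pair to one part of the site (illustrative path).
cookie["UserID"]["path"] = "/books"
# Ask the browser to discard the pair after this date (illustrative date).
cookie["UserID"]["expires"] = "Wed, 01 Jan 2003 00:00:00 GMT"

header = cookie.output()
print(header)
```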
 Cookies evolved because they solve a big problem for the people who implement web sites. In the broadest sense, a cookie allows a site to store state information on a visitor's computer. This information lets a web site remember what state the browser is in. An ID is one simple piece of state information—if an ID exists on the visiting computer, the site knows that the user has visited before. The state is, “Your browser has visited the site at least one time,” and the site knows the user ID from that visit.
 Sites can also store user preferences so that the site can look different for each visitor (often referred to as customization). For example, if one were to visit msn.com, it offers the visitor the ability to change content/layout/color. It also allows one to enter a zip code and get customized weather information. When the zip code is entered, the following name-value pair is an example of what might be added to MSN's cookie file:
 WEAT CC=NC%5FRaleigh%2DDurham&REGION= www.msn.com/
 It is apparent from this name-value pair that the visitor is from Raleigh, N.C. Most sites seem to store preferences like this in the site's database and store nothing but an ID as a cookie, but storing the actual values in name-value pairs is another way to do it.
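 The location can be read from the pair because the value is percent-encoded text. A short sketch using Python's standard urllib.parse module decodes the CC portion of the value shown above; the variable name city is illustrative.

```python
from urllib.parse import unquote

# %5F is an underscore and %2D is a hyphen in percent-encoding.
city = unquote("NC%5FRaleigh%2DDurham")
print(city)
```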
 eCommerce sites can implement things like shopping carts and “quick checkout” options. The cookie contains an ID and lets the site keep track of a visitor as the visitor adds different things to his or her “shopping cart.” Each item added is stored in the site's database along with the visitor's ID value. When the visitor checks out, the site knows what is in his or her cart by retrieving all of the selections from the database associated with that user or session ID. It would be impossible to implement a convenient shopping mechanism without cookies or something like them.
 In all of these examples, note that what the database is able to store are things the visitor has selected from the site, pages viewed from the site, information given to the site in online forms, etc. All of the information is stored in the site's database, and a cookie containing the visitor's unique ID is all that is stored on the client node 14 (FIG. 1) in most cases.
 An illustration of this interaction between a visitor's computer (client node) and the web server is shown in FIG. 2. The web server 30 includes a web page server 32 that stores and distributes, on request, web pages associated with a designated domain. The web server also includes a cookie database 34 that stores information about individual visitors to the web pages served by the web page server as described above.
 The client node 36, shown in FIG. 2 as a computer of the web page visitor, includes components typical to computers such as a monitor 38, keyboard input device 40, a hard drive storage device 42 and a microprocessor 44. The client node also includes an input/output device capable of being connected to the Internet such as an analog modem (not shown). The hard drive has stored on it a browser software program 46 that runs on the microprocessor, and a set of cookie files 48 that are stored by operation of the instructions from the web server to the browser as described above.
 The client node 36 makes a request for a web page that is directed to the web page server 32. If a cookie associated with the same domain as the web page requested is stored on the client node hard drive, then that cookie is also sent with the request. The web server 30 receives the request for the web page and sends the requested web page back to the client node along with a new cookie that, as in the case of the amazon.com site, stores additional name-value pair data within the client node cookie files 48. The same information is typically reflected within the cookie database 34 of the web server 30.
 A recent issue with cookies is the perceived invasion of privacy. Cookies allow sites to gather visitor information like never before. Certain infrastructure providers can actually create cookies that are visible on multiple sites. These providers typically fall into one of two categories: web advertisement services and web tracking services.
 The most famous of the former is DoubleClick, Inc. Many companies use DoubleClick to serve ad banners on their sites. Ad banners are typically graphic image files (GIF) located within the web page that display the advertisement. Code within the web site requests the image directly from the Ad provider's servers. This allows the Ad provider to load cookies on your computer. Ad providers like DoubleClick can then track your movements across multiple sites and thus form a very rich profile of the user at the client node. These profiles are still anonymous, but they are rich.
FIG. 3 illustrates how a client node receives web pages from a visited web site such as amazon.com but sends requests for and receives advertising images from a third party such as doubleclick.com. The client node 36 first sends a web page request in step (1) for a web page associated with a particular domain. The web server 30 associated with that domain receives the request and serves the web page back to the client node in step (2). As above, additional cookie data can pass between the client node and the web server. Located on that web page is code that calls for additional images (typically paid advertisements) stored at an advertisement server 50 (e.g. at the domain doubleclick.com) different than the web server 30. A cookie, placed on the client node in a previous visit, is sent to the ad provider 52 together with the request for the image in step (3). The cookie is analyzed and processed at the cookie database 54 of the ad provider and the ad server sends the advertisement image to the client node 36 in step (4) for display as part of the originally requested web page.
 A web tracking provider (illustrated in FIG. 4) operates on a similar principle but typically serves a passive role of collecting statistics and does not provide advertisement images for another entity's web site. Instead of serving an advertisement image, for instance, the web-tracking provider 56 provides new cookie information in step (4) to the client node 36. The image requested is typically only a 1×1 pixel image that is too small to be viewed by the naked eye and simply acts as a carrier on which tracking information is sent to the log analysis server 58 of the web-tracking provider. A new cookie is generated and sent to the client node. Visitor information is stored in a database 60 that can then be accessed by the web server 30 operator to see the popularity and demographics of the visitors to his or her web site.
 Recently, computer users have been concerned that profile information gathered by such firms as DoubleClick would be linked to name and address information. This has been perceived by many people as spying and has resulted in the implementation of several cookie-blocking techniques. The Microsoft Internet Explorer browser, for instance, has for many years included a feature whereby a user can elect to block access to all cookies at his or her machine (client node). Selecting such a feature eliminates all of the advantages that cookies provide such as personalized web content pages, storing of user preferences, etc.
 To allow a user to take advantage of cookies from web pages with which the visitor is directly interacting while still addressing privacy concerns, Microsoft has recently implemented a new feature in IE 5.5 that allows cookies from such sites to be used but blocks (or alerts the user to) third party cookies such as those from DoubleClick. The present invention, a preferred implementation of which is described below, is intended to circumvent this feature.
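 The distinction a browser draws between a cookie from the visited site and a third party cookie can be approximated as below. This is a deliberately naive sketch that compares only the last two labels of each host name; the function name is_third_party is hypothetical, and actual browsers, including IE 5.5, may apply different and more elaborate rules.

```python
def is_third_party(page_host: str, cookie_host: str) -> bool:
    """Naive check: hosts belong to the same site if their last two labels match."""
    def site(host: str) -> str:
        labels = host.lower().rstrip(".").split(".")
        return ".".join(labels[-2:])

    return site(page_host) != site(cookie_host)


# A cookie set by the visited site itself is first party.
print(is_third_party("www.amazon.com", "amazon.com"))
# A cookie set by an ad server embedded in that page is third party.
print(is_third_party("www.amazon.com", "ad.doubleclick.com"))
```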
 The Invention
 Once the request for the web page is received at the web server, the web page and cookie generation script embedded within the web page are sent back to the client node 36 in step (2). As the browser on the client node runs the script of the web page to display it on the client node monitor, the additional script is implemented to search for a cookie, generate a new cookie in step (3), and then process the cookie in step (4) to extract and then send in step (5) the information embedded therein to the web tracking provider. The information reflects the data collected from the client node and web page visiting session. The web server operator may access databases within the web tracking provider server 58 to look up traffic information for specific web sites in step (6).
 As the web page loads via the browser at the client node, the script accompanying the web page operates to:
 (2) If no cookie exists, then the script generates and then saves a cookie to the cookie file. An example of the subroutine script used for storing or writing the cookie (the function WCook) is shown below:
 (3) If a cookie exists, then the script reads the cookie stored on the client node, processes the cookie based on the values read and the new events that have occurred, and replaces the cookie with new values. An example of the subroutine script used for writing the new cookie values is shown below:
 (4) The script builds the processed information from the previous step into a string of all of the data that it has acquired (such as visitor sessions, path, browser type, screen resolution, eCommerce, time spent on page, etc.) and attaches that string to an image request made to the web infrastructure service provider (e.g. ad servers or visitor tracking services). An example of code used to implement this function is included below:
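 The following is an illustrative sketch of steps (2) through (4) only. It is not the applicants' script and does not reproduce Appendix A or the WCook function. It is written in Python to model logic that would run as script in the browser; the dictionary standing in for the cookie file, the field names, the function names process_cookie and build_image_request, and the collector URL are all assumptions.

```python
import uuid
from urllib.parse import urlencode

COLLECTOR_URL = "http://collector.example/pixel.gif"  # hypothetical endpoint


def process_cookie(cookie_store: dict, events: dict) -> dict:
    """Create the cookie if absent; otherwise update it from tracked events."""
    old = cookie_store.get("visitor")
    if old is None:
        # Step (2): no cookie exists, so generate one.
        new = {"id": uuid.uuid4().hex[:8], "visits": 1}
    else:
        # Step (3): derive new values from the old cookie.
        new = {"id": old["id"], "visits": old["visits"] + 1}
    new["last_page"] = events["page"]
    cookie_store["visitor"] = new  # stands in for writing the cookie locally
    return new


def build_image_request(cookie: dict, events: dict) -> str:
    """Step (4): attach the gathered data to an image request URL."""
    params = {**cookie,
              "res": events["screen"],
              "secs": events["seconds_on_page"]}
    return COLLECTOR_URL + "?" + urlencode(params)


store = {}
events = {"page": "/home", "screen": "1024x768", "seconds_on_page": 80}
process_cookie(store, events)            # first visit creates the cookie
cookie = process_cookie(store, events)   # second visit updates it
print(build_image_request(cookie, events))
```

 In this sketch the cookie is read and rewritten entirely on the client side, and only the query string of the image request carries the data to the collecting server.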
 By setting the source of the image to a variable built by the script (e.g. www.webtrendslive.com/button3.asp?id39786c45629t120045), all the gathered information can be passed to the web server doing the logging, e.g. data collection server 20 (FIG. 1). In this case, for instance, the variable string “id39786c45629t120045” is sent to a location such as one within applicants' webtrendslive.com web site and is interpreted by a decoder program built into the data analysis server 22 to mean that a user with ID#39786 loaded client web site #45629 in 4.5 seconds and spent 1:20 minutes there before moving to another web site.
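 A decoder of the kind mentioned above might look like the following sketch. The field layout is not specified in this description; the sketch assumes, based solely on the single example string, that the digits after "t" hold one minute digit, two second digits, and a three-digit load time in tenths of a second. The function name decode_tracking_string and the returned field names are hypothetical.

```python
import re

# Assumed layout: id<visitor>c<site>t<M><SS><load time in tenths of a second>
PATTERN = re.compile(r"id(\d+)c(\d+)t(\d)(\d{2})(\d{3})")


def decode_tracking_string(token: str) -> dict:
    """Decode a tracking string under the assumed layout described above."""
    match = PATTERN.fullmatch(token)
    if match is None:
        raise ValueError(f"unrecognized tracking string: {token!r}")
    visitor, site, minutes, seconds, load_tenths = match.groups()
    return {
        "visitor_id": visitor,
        "site_id": site,
        "seconds_on_site": int(minutes) * 60 + int(seconds),
        "load_seconds": int(load_tenths) / 10,
    }


record = decode_tracking_string("id39786c45629t120045")
print(record)
```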
 A sample of the complete code used to implement the invention is included in Appendix A.
 An advantage of the present invention is that all cookie reading and rewriting processes take place on the client node and no cookies get sent over the Internet. Accordingly, important information about the client node can still be mined and sent to a third party site that can accumulate and analyze such information without being affected by the cookie blocking features of such modern browsers as IE 5.5.
 Having described and illustrated the principles of the invention in a preferred embodiment thereof, it should be apparent that the invention can be modified in arrangement and detail without departing from such principles. We claim all modifications and variations coming within the spirit and scope of the following claims.