|Publication number||US7966333 B1|
|Application number||US 12/204,709|
|Publication date||Jun 21, 2011|
|Filing date||Sep 4, 2008|
|Priority date||Jun 17, 2003|
|Publication number||12204709, 204709, US 7966333 B1, US 7966333B1, US-B1-7966333, US7966333 B1, US7966333B1|
|Inventors||Krishna Uppala, Umachandar Jayachandaran, Roman Basko, Stella Chan, Piali Choudary|
|Original Assignee||AudienceScience Inc.|
This patent application is a continuation of U.S. patent application Ser. No. 11/073,016, entitled “USER SEGMENT POPULATION TECHNIQUES,” filed Mar. 4, 2005, now abandoned, which is a continuation of U.S. patent application Ser. No. 10/870,553, entitled “USER SEGMENT POPULATION TECHNIQUES,” filed Jun. 17, 2004, now abandoned, which claims priority to U.S. Provisional Application No. 60/479,609, entitled “USER SEGMENT POPULATION TECHNIQUES,” filed on Jun. 17, 2003, and which is related to U.S. patent application Ser. No. 10/870,688, filed Jun. 17, 2004, entitled “USER SEGMENTATION USER INTERFACE,” all of which applications are hereby incorporated by reference in their entirety.
The present invention is directed to the field of analytical techniques, and, more particularly, to the field of population segmentation techniques.
Web browsing is an increasingly common behavior. When a user browses a particular web site, the operator of that web site can collect significant volumes of information about the user's interaction with the web site. As one way of deriving value from such collected user data, a web site operator may wish to divide the users that visit its web site into groups, such that the members in each group share one or more significant characteristics or behaviors. This process of dividing users into groups is termed segmentation, and the groups are called segments. As one example, one segment of users for a sporting goods web site may be those users who visited any of the pages of a section of the web site devoted to fishing in the past 2 weeks.
User data is typically stored in a database, and can be extremely voluminous for a popular web site. The process of segmenting users based on user data is called segment population. In order to perform segment population, the operator of a web site must generally write custom code to manipulate the user data stored in the database to identify the members of each segment. This is time-consuming, requires the services of a skilled programmer, and can be quite expensive.
Also, because the set of segments to be populated and the tests used to identify the members of each segment in the set are typically embodied by the custom code, if a consumer of the segments wants to specify and populate a new segment after the original custom programming is complete, additional custom code must be written.
Further, it can be inefficient to analyze data within the database in order to perform segment population. This makes it expensive to use segments, and may limit the frequency with which segments can be repopulated, in turn limiting the currency of segment populations.
In view of the aforementioned shortcomings of conventional user segmentation techniques, a facility for automatically populating segments in an efficient way based upon plain-English segment definitions that can be prepared by users other than programmers would have significant utility.
A software facility (“the facility”) for segmenting individuals who are members of a population, such as individuals visiting one or more web sites, is described. In some embodiments, the facility provides tools that enable a user who is not a programmer to define new segments and establish the tests used to automatically identify individuals who are members of the segment; takes advantage of highly-efficient techniques for processing the large quantities of data required to populate segments; and/or provides a hierarchical organization of segments that can be used to select segments more easily and intuitively, such as by selecting a segment whose data is to appear in a report.
In some embodiments, the facility provides tools that enable a user who is not a programmer to define new segments and establish the tests used to automatically identify individuals who are members of the segment. These tools enable virtually any user to define a new segment by (1) selecting plain-language characterizations of conditions, also called “clauses,” from a preexisting list of conditions; (2) specifying values for any variables contained by a selected clause; and (3) selecting logical operators to be used to combine the selected clauses into a membership test for the new segment. For example, to define a new segment for male users who registered in the past week, a user may (1) select plain-language characterizations of two clauses, “User property or characteristic” and “Users who performed an Event”; (2) in the User property or characteristic clause, select a “gender” value for an “attribute” variable and a “male” value for a “value” variable; and (3) select the AND logical operator to combine the User property or characteristic and Users who performed an Event clauses into a membership test for the segment.
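The three definition steps above can be illustrated with a minimal Python sketch. It is not the facility's implementation: the `Clause` and `Segment` classes, the user-record layout, and the event clause's values (`Register`, seven days) are assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical user record: a plain dict of attributes and event history.
User = Dict[str, object]
Values = Dict[str, object]


@dataclass
class Clause:
    """A plain-language condition plus the values chosen for its variables."""
    label: str
    values: Values
    test: Callable[[User, Values], bool]

    def matches(self, user: User) -> bool:
        return self.test(user, self.values)


@dataclass
class Segment:
    """Selected clauses combined by one logical operator into a membership test."""
    name: str
    operator: str  # "AND" or "OR"
    clauses: List[Clause]

    def is_member(self, user: User) -> bool:
        results = [clause.matches(user) for clause in self.clauses]
        if self.operator == "AND":
            return all(results)
        if self.operator == "OR":
            return any(results)
        raise ValueError(f"unsupported operator: {self.operator}")


def property_test(user, values):
    # True when the chosen attribute has the chosen value.
    return user.get(values["attribute"]) == values["value"]


def event_test(user, values):
    # True when the user performed the event within the chosen number of days.
    return any(
        event["name"] == values["event"] and event["days_ago"] <= values["within_days"]
        for event in user.get("events", [])
    )


# Step 1: pick two clauses; step 2: fill in their variables; step 3: combine with AND.
new_male_registrants = Segment(
    name="male users who registered in the past week",
    operator="AND",
    clauses=[
        Clause("User property or characteristic",
               {"attribute": "gender", "value": "male"}, property_test),
        Clause("Users who performed an Event",
               {"event": "Register", "within_days": 7}, event_test),
    ],
)
```

Under these assumptions, calling `new_male_registrants.is_member(user)` returns true only for a user record that satisfies both clauses.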
In some embodiments, the facility takes advantage of highly-efficient techniques for processing the large quantities of data required to populate segments. These are based upon a “column-chunking” approach, where large database tables containing data used to populate segments, also called “fact tables,” are broken into small pieces called “column chunks,” which may be loaded into memory only a few at a time as part of the process of populating segments. In particular, column chunks may be simultaneously differentiated on up to four separate bases: (1) the identity of the table from which the data in the piece is taken; (2) the identity of the column of that table from which the data in the piece is taken; (3) a date to which the data relates; and (4) an arbitrary group of users to which a user to whom the data relates belongs, such as a group of users whose user identifiers all hash to the same hash value. Because segment membership is determined on a per-user basis, column chunks for different arbitrary groups of users may be loaded with complete mutual exclusivity, reducing the amount of memory needed to accommodate simultaneously-loaded data in physical memory, and/or reducing performance degradation caused by paging virtual memory contents into and out of physical memory. Additional processing techniques include a resource request model, where segments are populated using recursive asynchronous requests.
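As a rough illustration of the four bases on which column chunks are differentiated, the following Python sketch splits an in-memory fact table into chunks keyed by table, column, date, and user hash bucket. The row layout, the bucket count, and the use of CRC32 as the hash are all assumptions; the source does not specify a hash function or a storage format.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # assumed number of arbitrary user groups


def user_bucket(user_id, buckets=NUM_BUCKETS):
    """Map a user identifier to a stable hash bucket (CRC32 is a stand-in)."""
    return zlib.crc32(user_id.encode("utf-8")) % buckets


def chunk_fact_table(table_name, rows, columns):
    """Split fact-table rows into column chunks.

    rows: list of dicts, each with 'user_id', 'date', and one entry per column.
    Returns {(table, column, date, bucket): [(user_id, value), ...]}.
    """
    chunks = defaultdict(list)
    for row in rows:
        bucket = user_bucket(row["user_id"])
        for column in columns:
            key = (table_name, column, row["date"], bucket)
            chunks[key].append((row["user_id"], row[column]))
    return dict(chunks)
```

Because every row for a given user lands in the same bucket, the chunks for one bucket can be processed without loading the chunks for any other bucket, which is the property the paragraph above relies on.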
In some embodiments, the facility provides a hierarchical organization of segments that can be used to select segments more easily and intuitively, such as by selecting a segment whose data is to appear in a report. The facility provides various reports that employ segments, including reports on particular user characteristics or attributes that are filtered to include only the characteristics or attributes of users in one or more segments, and reports showing information about the populations of one or more segments.
The segmentation computer system 120 typically includes one or more central processing units (“CPUs”) 130 for executing computer programs; a computer memory 140 for storing programs and data—including data structures—while they are being used; and a persistent storage device 150, such as a hard drive, for persistently storing programs and data. In particular, the storage device contains database tables 151 that contain data used by the facility to perform segment population; a database engine 152 for accessing information in the database tables; the facility 153; segment definitions 154 created by users and/or implementers of the facility; a web server 155 for use in communicating with the client computer systems; clause modules 156 each containing program and/or higher-level logic for determining whether a particular clause is satisfied; a body of clause results 157 containing information indicating, for each of one or more clauses, the users that satisfy the clause; and segment results 158 containing information indicating, for each of one or more segments, which users are members of the segment. As will be apparent to one of ordinary skill in the art, the storage device and memory may have various other contents, not shown. Further, data may be transferred between persistent storage and memory, and between individual storage devices, for purposes such as optimizing the availability of particular data and safeguarding the persistence of particular data. Placing particular data in the storage device or memory is referred to herein as “storing” the data, while moving particular data from the storage device to the memory is referred to herein as “loading” the data.
While computer systems configured as described above are typically used to support the operation of the facility, one of ordinary skill in the art will appreciate that the facility may be implemented using devices of various types and configurations, and having various components.
The display further includes a description field 982, into which the user of the facility may enter a textual description for the segment. The display further includes a select location button 990 that the user of the facility may select in order to select a location within a hierarchy used to organize defined segments. This display further includes a Save Segment button 991 that the user of the facility may select to save this segment, and a Cancel button 992 that the user of the facility may select in order to cancel creation of this segment.
Line 1011 shows that the segment is identified by the number 5. Line 1012 shows that the name of the segment is “sample segment A.” Line 1013 shows that the description of the segment is “men who registered this week.” Line 1014 shows that the segment's population strategy is recurring snapshot. Lines 1016-1022 contain the definition of the segment. Line 1016 shows that the clause shown on lines 1017-1019 is combined with the clause shown on lines 1020-1022 using the Boolean AND operator. Lines 1017-1019 show that the first clause, clause 5.1, is a clause type 1—Users who performed an event—for which the following values are specified: have, Register, one week. Lines 1020-1022 show that the second clause, clause 5.2, is a clause type 4—User property or characteristic—for which the following values are specified: Gender, equal to, M.
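For readers who prefer a concrete data structure, the definition just described might be held in memory roughly as follows. This is a sketch only: the key names and the nesting are hypothetical, and only the values (identifier 5, the name, the description, the population strategy, the AND operator, and the two clauses with their values) come from the description above.

```python
# Hypothetical in-memory form of the segment definition described above.
SEGMENT_5 = {
    "id": 5,
    "name": "sample segment A",
    "description": "men who registered this week",
    "population_strategy": "recurring snapshot",
    "operator": "AND",
    "clauses": [
        {
            "id": "5.1",
            "type": 1,
            "type_name": "Users who performed an event",
            "values": ["have", "Register", "one week"],
        },
        {
            "id": "5.2",
            "type": 4,
            "type_name": "User property or characteristic",
            "values": ["Gender", "equal to", "M"],
        },
    ],
}


def clause_ids(definition):
    """List the identifiers of every clause used by a segment definition."""
    return [clause["id"] for clause in definition["clauses"]]
```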
The facility typically generates a set of column chunks for each column, each column chunk in the set having a different hash value.
The data flow begins with a request 3001 to populate segment 5 whose definition is shown in
The request received by the segment resource handler contains information identifying the segment to be populated, such as the segment identifier constituting a reference to the segment's definition 1010. When the segment resource handler receives the request, in some embodiments it checks a cache 3011 maintained for the segment resource handler to see if the cache contains a segment result for the requested segment that can be used to satisfy the request without having to populate the segment. Where the cache does not contain a segment result that can be used to satisfy request 3001, the segment resource handler uses the segment definition to identify all of the clauses utilized in the segment definition. In the example, the segment resource handler identifies clause 5.1, which appears on lines 1017-1019 of the definition for segment 5, and clause 5.2, which appears on lines 1020-1022. Accordingly, the segment resource handler submits two requests to the clause resource handler 3092: a request 3011 for clause 5.1, and a request 3021 for clause 5.2. As noted above, these requests may be submitted either directly to the clause resource handler, or through the generic resource handler. They may be submitted asynchronously.
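The cache check and the asynchronous fan-out of clause requests described above can be sketched with Python's `asyncio`. The class names, the use of sets of user identifiers as results, and the `evaluators` mapping are assumptions made for the example; the source does not prescribe an API.

```python
import asyncio


class ClauseHandler:
    """Stand-in clause resource handler with its own result cache."""

    def __init__(self, evaluators):
        self.evaluators = evaluators  # clause id -> function returning user ids
        self.cache = {}
        self.evaluations = 0

    async def request(self, clause_id):
        if clause_id in self.cache:
            return self.cache[clause_id]
        await asyncio.sleep(0)  # yield control; stands in for real I/O
        self.evaluations += 1
        result = set(self.evaluators[clause_id]())
        self.cache[clause_id] = result
        return result


class SegmentHandler:
    """Stand-in segment resource handler: check cache, else request clauses."""

    def __init__(self, definitions, clause_handler):
        self.definitions = definitions
        self.clause_handler = clause_handler
        self.cache = {}

    async def request(self, segment_id):
        if segment_id in self.cache:
            return self.cache[segment_id]
        definition = self.definitions[segment_id]
        # Submit one request per clause and wait for all of them concurrently.
        results = await asyncio.gather(
            *(self.clause_handler.request(cid) for cid in definition["clauses"])
        )
        combine = set.intersection if definition["operator"] == "AND" else set.union
        members = combine(*results)
        self.cache[segment_id] = members
        return members


clause_handler = ClauseHandler({
    "5.1": lambda: {"u1", "u2"},  # assumed users who registered
    "5.2": lambda: {"u2", "u3"},  # assumed users whose gender is M
})
segment_handler = SegmentHandler(
    {5: {"operator": "AND", "clauses": ["5.1", "5.2"]}}, clause_handler
)
members = asyncio.run(segment_handler.request(5))
```

A second request for the same segment is answered from the segment handler's cache, so no clause is evaluated again.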
When the clause resource handler 3092 receives request 3011, it checks its cache 3082 to determine whether this cache contains a clause result that can be used to satisfy request 3011. Where the cache does not contain a suitable clause result, the clause resource handler uses the clause definition from the segment definition, together with metadata for the clause type referenced in the clause definition, to identify a set of column chunks needed to identify users who satisfy the clause. The clause resource handler then submits a request 3012 for this column chunk set to a column chunk set resource handler 3093. In the example, the clause resource handler determines that the column chunks needed to process the clause are those from table 1, columns 1 and 2, dates 1 through 7, all hash values. This requested column chunk set is hereafter denoted T1 C1,2 D1-7H*.
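To make the chunk-set notation concrete, the sketch below expands a request of this form into one key per column chunk. The function name and key layout are hypothetical, and the number of hash values (two here) is assumed purely to keep the example small.

```python
from itertools import product


def expand_chunk_set(table, columns, dates, hash_values):
    """Enumerate one (table, column, date, hash value) key per column chunk."""
    return [
        (table, column, date, hash_value)
        for column, date, hash_value in product(columns, dates, hash_values)
    ]


# Table 1, columns 1 and 2, dates 1 through 7, all hash values
# (two hash values assumed for this example).
requested = expand_chunk_set("T1", [1, 2], range(1, 8), range(1, 3))
```

With two columns, seven dates, and two assumed hash values, the request expands to 28 column chunks.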
When the column chunk set resource handler receives request 3012, it checks its cache 3083 for column chunks among the requested set. For any column chunks not available in cache 3083, the column chunk set resource handler retrieves these column chunks 3013 from the file system files 3094 where they were stored during the column chunking process. In the example, column chunks 2200, 2300, 2600, and 2700 are retrieved for hash value 1; column chunks 2400, 2500, 2800, and 2900 are retrieved for hash value 2; and additional column chunks (not shown) are retrieved for the remaining hash values shown in the hash columns for instances of table T1, such as hash column 1600 in
The column chunk set resource handler assembles retrieved column chunks 3013 into a column chunk set 3014 satisfying request 3012, and passes this column chunk set back to the clause resource handler. In step 3015, the clause resource handler invokes a routine that executes against the returned column chunk set 3014 in order to evaluate the clause for each user. The clause resource handler typically selects a routine optimized specifically for evaluating clauses of the same type as the requested clause. The clause resource handler passes the clause result 3016 generated in this manner back to the segment resource handler. In some embodiments, the clause result is a bit array containing a bit for each user that indicates whether or not the user satisfies the clause.
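The clause evaluation step and the bit-array clause result can be sketched as follows. The chunk layout (lists of user and event pairs), the `user_index` mapping from user identifier to bit position, and the function names are assumptions; the source states only that the result may be a bit array with one bit per user.

```python
def evaluate_event_clause(chunk_set, event_name, user_index):
    """Return a bit array with a 1 bit for each user who performed the event.

    chunk_set: iterable of chunks, each a list of (user_id, event) pairs.
    user_index: dict mapping user_id to that user's bit position.
    """
    bits = bytearray((len(user_index) + 7) // 8)
    for chunk in chunk_set:
        for user_id, event in chunk:
            if event == event_name:
                position = user_index[user_id]
                bits[position // 8] |= 1 << (position % 8)
    return bits


def bit_is_set(bits, position):
    """Report whether the user at the given bit position satisfies the clause."""
    return bool(bits[position // 8] & (1 << (position % 8)))
```

A routine of this kind makes a single pass over the loaded chunks and never needs more than one bit of state per user.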
When the segment resource handler receives clause result 3016, it combines this clause result in step 3031 with the results for any other clauses defined within the segment definition for the requested segment. In the example, the result 3016 for clause 5.1 is combined with a clause result for clause 5.2.
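When clause results are bit arrays, combining them with the Boolean AND operator reduces to a bitwise AND. The following sketch assumes equal-length `bytes` objects with one bit per user, least significant bit first; the helper names and the sample bit patterns are illustrative only.

```python
def combine_and(first, second):
    """Bitwise AND of two equal-length clause-result bit arrays."""
    if len(first) != len(second):
        raise ValueError("clause results must cover the same users")
    return bytes(a & b for a, b in zip(first, second))


def members(bits, users):
    """List the users whose bit is set; users[i] owns bit position i."""
    return [
        user for position, user in enumerate(users)
        if bits[position // 8] & (1 << (position % 8))
    ]


users = ["u0", "u1", "u2", "u3"]
clause_5_1 = bytes([0b0110])  # assumed: u1 and u2 registered in the past week
clause_5_2 = bytes([0b0011])  # assumed: u0 and u1 have gender M
segment_5 = combine_and(clause_5_1, clause_5_2)
```

Here only `u1` has a set bit in both clause results, so only `u1` is a member of the combined segment result.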
When the clause resource handler receives request 3021 for clause 5.2, it determines that, because the clause depends upon user properties or characteristics contained in a single database table of manageable size, a result can be generated efficiently for this clause without reference to any column chunks by issuing a database query against the database table containing this user property or characteristic information. Accordingly, the clause resource handler issues a database query 3022 for this clause. The database query is directed to database engine 152. The database query is a query against table T3, shown in
It should be noted that a segment definition may include any number of clauses whose results the clause resource handler generates using column chunk sets, direct database queries, or other clause evaluation techniques. Clause results generated by the clause resource handler in any of these manners may be combined by the segment resource handler in order to generate a segment result.
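The direct-query technique described for clause 5.2 can be sketched with Python's built-in `sqlite3` module standing in for the database engine. The table name `user_properties`, its columns, and the sample rows are hypothetical; the source identifies the queried table only as T3.

```python
import sqlite3

ALLOWED_ATTRIBUTES = {"gender"}  # column names permitted in the query text


def users_matching_property(conn, attribute, value):
    """Return ids of users whose attribute equals value, via a direct query."""
    if attribute not in ALLOWED_ATTRIBUTES:
        raise ValueError(f"unknown attribute: {attribute}")
    # The column name comes from the whitelist above; the value is bound
    # as a parameter rather than interpolated into the SQL text.
    cursor = conn.execute(
        f"SELECT user_id FROM user_properties WHERE {attribute} = ? ORDER BY user_id",
        (value,),
    )
    return [row[0] for row in cursor]


# Small in-memory table standing in for the user property table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_properties (user_id INTEGER PRIMARY KEY, gender TEXT)")
conn.executemany(
    "INSERT INTO user_properties VALUES (?, ?)",
    [(1, "M"), (2, "F"), (3, "M")],
)
matching = users_matching_property(conn, "gender", "M")
```

A result obtained this way can be converted to the same form as a column-chunk-based clause result before the segment resource handler combines them.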
While the flow diagrams discussed above depict their steps in a particular sequence and in a synchronous mode of execution for the sake of clarity, those skilled in the art will appreciate that these steps may be performed in a variety of different manners. For instance, the steps may be performed in a different order, parallelized, performed asynchronously, etc.
The user of the facility may go on to select a new user type or segment from user type and segment list box 3720, and the report contents will instead be filtered by that user type or segment. The user of the facility may also modify date range fields, such as a frequency field 3741 and a date range field 3742, to change the date range reflected by the report's contents. When the user of the facility enters values in these fields, the facility updates the report to reflect the newly-specified date range.
It will be appreciated by those skilled in the art that the above-described facility may be straightforwardly adapted or extended in various ways. While the foregoing description makes reference to preferred embodiments, the scope of the invention is defined solely by the claims that follow and the elements recited therein.
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US5530939||Sep 29, 1994||Jun 25, 1996||Bell Communications Research, Inc.||Method and system for broadcasting and querying a database using a multi-function module|
|US5721831||Jun 3, 1994||Feb 24, 1998||Ncr Corporation||Method and apparatus for recording results of marketing activity in a database of a bank, and for searching the recorded results|
|US5742806||Jan 31, 1994||Apr 21, 1998||Sun Microsystems, Inc.||Apparatus and method for decomposing database queries for database management system including multiprocessor digital data processing system|
|US5870746||Oct 31, 1996||Feb 9, 1999||Ncr Corporation||System and method for segmenting a database based upon data attributes|
|US6003036||Feb 12, 1998||Dec 14, 1999||Martin; Michael W.||Interval-partitioning method for multidimensional data|
|US6078891||Nov 24, 1997||Jun 20, 2000||Riordan; John||Method and system for collecting and processing marketing data|
|US6112186||Mar 31, 1997||Aug 29, 2000||Microsoft Corporation||Distributed system for facilitating exchange of user information and opinion using automated collaborative filtering|
|US6374263 *||Jul 19, 1999||Apr 16, 2002||International Business Machines Corp.||System for maintaining precomputed views|
|US6377993 *||Sep 24, 1998||Apr 23, 2002||Mci Worldcom, Inc.||Integrated proxy interface for web based data management reports|
|US6430539||May 6, 1999||Aug 6, 2002||Hnc Software||Predictive modeling of consumer financial behavior|
|US6609131||Sep 27, 1999||Aug 19, 2003||Oracle International Corporation||Parallel partition-wise joins|
|US6615258 *||Sep 24, 1998||Sep 2, 2003||Worldcom, Inc.||Integrated customer interface for web based data management|
|US6629102 *||Jul 28, 2000||Sep 30, 2003||International Business Machines Corporation||Efficiently updating a key table during outline restructure of a multi-dimensional database|
|US6694322 *||Jun 29, 2001||Feb 17, 2004||Alphablox Corporation||Caching scheme for multi-dimensional data|
|US6785666 *||Jul 11, 2000||Aug 31, 2004||Revenue Science, Inc.||Method and system for parsing navigation information|
|US6839682||Oct 3, 2000||Jan 4, 2005||Fair Isaac Corporation||Predictive modeling of consumer financial behavior using supervised segmentation and nearest-neighbor matching|
|US6873981 *||Dec 16, 2002||Mar 29, 2005||Revenue Science, Inc.||Method and system for parsing navigation information|
|US6917972 *||Dec 5, 2001||Jul 12, 2005||Revenue Science, Inc.||Parsing navigation information to identify occurrences corresponding to defined categories|
|US6993529 *||Jun 1, 2001||Jan 31, 2006||Revenue Science, Inc.||Importing data using metadata|
|US7035925 *||Jun 8, 2005||Apr 25, 2006||Revenue Science, Inc.||Parsing navigation information to identify interactions based on the times of their occurrences|
|US7107338 *||Dec 5, 2001||Sep 12, 2006||Revenue Science, Inc.||Parsing navigation information to identify interactions based on the times of their occurrences|
|US7117193 *||Dec 5, 2001||Oct 3, 2006||Revenue Science, Inc.||Parsing navigation information to identify occurrences of events of interest|
|US7120666||Oct 30, 2002||Oct 10, 2006||Riverbed Technology, Inc.||Transaction accelerator for client-server communication systems|
|US7165037||Dec 14, 2004||Jan 16, 2007||Fair Isaac Corporation||Predictive modeling of consumer financial behavior using supervised segmentation and nearest-neighbor matching|
|US7188334||Nov 20, 2001||Mar 6, 2007||Ncr Corp.||Value-ordered primary index and row hash match scan|
|US7231612||Feb 21, 2003||Jun 12, 2007||Verizon Laboratories Inc.||Computer-executable method for improving understanding of business data by interactive rule manipulation|
|US7464122 *||Jul 27, 2006||Dec 9, 2008||Revenue Science, Inc.||Parsing navigation information to identify occurrences of events of interest|
|US7493312||May 24, 2004||Feb 17, 2009||Microsoft Corporation||Media agent|
|US7676467 *||Mar 9, 2010||AudienceScience Inc.||User segment population techniques|
|US20020082901||Apr 30, 2001||Jun 27, 2002||Dunning Ted E.||Relationship discovery engine|
|US20020095421 *||Dec 13, 2000||Jul 18, 2002||Koskas Elie Ouzi||Methods of organizing data and processing queries in a database system, and database system and software product for implementing such methods|
|US20020099691||Jun 24, 1998||Jul 25, 2002||Michael Dean Lore||Method and apparatus for aggregation of data in a database management system|
|US20030014304||Jul 10, 2001||Jan 16, 2003||Avenue A, Inc.||Method of analyzing internet advertising effects|
|US20030028509||Aug 6, 2001||Feb 6, 2003||Adam Sah||Storage of row-column data|
|US20030074348||Oct 16, 2001||Apr 17, 2003||Ncr Corporation||Partitioned database system|
|US20030101451||Jan 9, 2002||May 29, 2003||Isaac Bentolila||System, method, and software application for targeted advertising via behavioral model clustering, and preference programming based on behavioral model clusters|
|US20030163438||Jan 16, 2001||Aug 28, 2003||General Electric Company||Delegated administration of information in a database directory using at least one arbitrary group of users|
|US20030204447||May 9, 2002||Oct 30, 2003||Dalzell Richard L.||Metadata service that supports user-to-user sales via third party web pages|
|US20030216966||Apr 3, 2003||Nov 20, 2003||Javier Saenz||Information processing system for targeted marketing and customer relationship management|
|US20040088376||Oct 30, 2002||May 6, 2004||Nbt Technology, Inc.||Transaction accelerator for client-server communication systems|
|US20040172400 *||Jan 20, 2004||Sep 2, 2004||Rony Zarom||Using associative memory to perform database operations|
|US20040181554||Mar 24, 2004||Sep 16, 2004||Heckerman David E.||Apparatus and accompanying methods for visualizing clusters of data and hierarchical cluster classifications|
|US20040230947||Feb 25, 2004||Nov 18, 2004||Bales Christopher E.||Systems and methods for personalizing a portal|
|US20050015571||May 29, 2003||Jan 20, 2005||International Business Machines Corporation||System and method for automatically segmenting and populating a distributed computing problem|
|US20050086243||Nov 8, 2004||Apr 21, 2005||Tangis Corporation||Logging and analyzing computer user's context data|
|US20050159996||Dec 14, 2004||Jul 21, 2005||Lazarus Michael A.||Predictive modeling of consumer financial behavior using supervised segmentation and nearest-neighbor matching|
|US20050203888||Jun 7, 2004||Sep 15, 2005||Iron Mountain Incorporated||Method and apparatus for improved relevance of search results|
|US20050240468||Apr 29, 2005||Oct 27, 2005||Claritas, Inc.||Method and apparatus for population segmentation|
|US20060041548||Jul 22, 2005||Feb 23, 2006||Jeffrey Parsons||System and method for estimating user ratings from user behavior and providing recommendations|
|US20060069719||Sep 15, 2005||Mar 30, 2006||Riverbed Technology, Inc.||Transaction accelerator for client-server communication systems|
|US20060112222||Nov 4, 2005||May 25, 2006||Barrall Geoffrey S||Dynamically expandable and contractible fault-tolerant storage system permitting variously sized storage devices and method|
|US20060155605||Sep 10, 2003||Jul 13, 2006||Peter Haighton||Rich media personal selling system|
|US20060190333||Feb 21, 2006||Aug 24, 2006||Justin Choi||Brand monitoring and marketing system|
|US20060235764||Jun 15, 2006||Oct 19, 2006||Alticor Investments, Inc.||Electronic commerce transactions within a marketing system that may contain a membership buying opportunity|
|US20060277585||Dec 20, 2005||Dec 7, 2006||Error Christopher R||Creation of segmentation definitions|
|US20080097822||Oct 11, 2005||Apr 24, 2008||Timothy Schigel||System And Method For Facilitating Network Connectivity Based On User Characteristics|
|US20080189232||Jan 31, 2008||Aug 7, 2008||Veoh Networks, Inc.||Indicator-based recommendation system|
|1||"H", Microsoft Computer Dictionary, Fifth Edition. Published May 2002.|
|2||*||Maurer, W. D. and Lewis, T. G. 1975. Hash Table Methods. ACM Comput. Surv. 7, 1 (Mar. 1975), 5-19. DOI: http://doi.acm.org/10.1145/356643.356645.|
|3||Non-Final Office Action to U.S. Appl. No. 10/870,688, mailed Sep. 22, 2008; 13 pages.|
|4||U.S. Appl. No. 09/613,403, filed Jul. 11, 2000, Subramanian.|
|5||U.S. Appl. No. 10/870,688, filed Jun. 17, 2004, Kumar et al.|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US9400824 *||Mar 19, 2014||Jul 26, 2016||Google Inc.||Systems and methods for sorting data|
|US20100306029 *||Dec 2, 2010||Ryan Jolley||Cardholder Clusters|
|US20150213023 *||Mar 19, 2014||Jul 30, 2015||Google, Inc.||Systems and methods for sorting data|
|Cooperative Classification||G06F17/30598, G06F17/30522|
|European Classification||G06F17/30S4P7, G06F17/30S8R1|
|Mar 17, 2009||AS||Assignment||Owner name: REVENUE SCIENCE, INC., WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:UPPALA, KRISHNA;JAYACHANDARAN, UMACHANDAR;BASKO, ROMAN;AND OTHERS;SIGNING DATES FROM 20050307 TO 20050308;REEL/FRAME:022405/0728|
|Mar 2, 2010||AS||Assignment||Owner name: SILICON VALLEY BANK, CALIFORNIA. Free format text: SECURITY AGREEMENT;ASSIGNOR:AUDIENCESCIENCE INC.;REEL/FRAME:024096/0172. Effective date: 20100217|
|Mar 19, 2010||AS||Assignment||Owner name: AUDIENCESCIENCE INC., WASHINGTON. Free format text: CHANGE OF NAME;ASSIGNOR:REVENUE SCIENCE, INC.;REEL/FRAME:024111/0001. Effective date: 20090122|
|Oct 12, 2011||AS||Assignment||Owner name: GOLD HILL CAPITAL 2008, LP, CALIFORNIA. Free format text: SECURITY AGREEMENT;ASSIGNOR:AUDIENCESCIENCE, INC.;REEL/FRAME:027047/0780. Effective date: 20111011|
|Nov 21, 2011||AS||Assignment||Owner name: SILICON VALLEY BANK, CALIFORNIA. Free format text: SECURITY AGREEMENT;ASSIGNOR:AUDIENCESCIENCE INC.;REEL/FRAME:027256/0756. Effective date: 20111116|
|Nov 19, 2014||FPAY||Fee payment||Year of fee payment: 4|
|Sep 11, 2015||AS||Assignment||Owner name: AUDIENCESCIENCE, INC., WASHINGTON. Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:GOLD HILL CAPITAL 2008, LP;REEL/FRAME:036587/0489. Effective date: 20150911|
|Sep 25, 2015||AS||Assignment||Owner name: ORIX VENTURES, LLC, TEXAS. Free format text: SECURITY INTEREST;ASSIGNOR:AUDIENCESCIENCE INC.;REEL/FRAME:036654/0640. Effective date: 20150922|