Publication number: US20030093423 A1
Publication type: Application
Application number: US 09/921,230
Publication date: May 15, 2003
Filing date: Aug 1, 2001
Priority date: May 7, 2001
Also published as: US6978266, US7359899, US20050278363
Inventors: John Larason, Alan Packer
Original Assignee: Larason John Todd, Packer Alan J.
Determining a rating for a collection of documents
US 20030093423 A1
Abstract
On one or more data processing systems, a collection rating is determined for a rating scale for contents of a document collection. A link rating is determined for the rating scale for contents linked to or linked by contents of the document collection. The collection rating for the rating scale for contents of the document collection is then modified, based on the determined link rating for the rating scale for contents linked to or linked by contents of the document collection.
Claims(38)
What is claimed is:
1. A method of operation on one or more data processing machines, the method comprising:
determining a first collection rating for a first rating scale for contents of a first document collection;
determining a first link rating for said first rating scale for contents linked to or linked by contents of said first document collection; and
modifying said first collection rating for said first rating scale for contents of said first document collection based on said determined first link rating for said first rating scale for contents linked to or linked by contents of said first document collection.
2. The method of claim 1, wherein said determining of a first collection rating comprises determining said first collection rating based on document ratings of a first subset of documents of said first collection of documents, and sizes of the documents of the first subset of documents of the first document collection.
3. The method of claim 2, wherein said first subset of documents of said first document collection consists of first textual documents of said first document collection.
4. The method of claim 1, wherein said determining of a first link rating comprises determining at least a second collection rating for at least a second document collection with documents linked to or linked by documents of said first document collection, and determining said first link rating based on said determined at least a second collection rating of said at least a second document collection.
5. The method of claim 1, wherein said modifying of the first collection rating comprises replacing the determined first collection rating with said determined first link rating.
6. The method of claim 1, wherein said modifying of the first collection rating comprises adding said determined first link rating to the determined first collection rating.
7. The method of claim 1, wherein said modifying of the first collection rating comprises subtracting said determined first link rating from the determined first collection rating.
8. The method of claim 1, wherein said first document collection is a web site, and said contents of said first document collection are web pages.
9. A method of operation on one or more data processing machines, the method comprising:
determining document ratings for a rating scale for a subset of documents of a document collection;
determining sizes of the documents of said subset;
determining a collection rating for said rating scale for said document collection based on said determined document ratings of said subset of documents, and normalized by said determined sizes of said subset of documents.
10. The method of claim 9, wherein said determining of the collection rating comprises further subdividing said subset of documents into a plurality of groups in accordance with their determined sizes, and applying a weight to the document rating determined for said rating scale for each document of the subset in accordance to the document's size group classification.
11. The method of claim 10, wherein weights are applied to said determined document ratings for said rating scale as follows:
Document size range (in bytes)    Weight
<500                              1
500-999                           4
1000-4999                         7
5000-9999                         10
>9999                             13
12. The method of claim 9, wherein said determining of the collection rating comprises further subdividing said subset of documents into a plurality of groups in accordance with their determined ratings for said rating scale, and applying a weight to the document rating determined for said rating scale for each document of the subset in accordance to the document's rate group classification.
13. The method of claim 12, wherein weights are applied to said determined document ratings for said rating scale as follows:
Determined document rating for said rating scale    Weight
0                                                   −0.5
1                                                   0.5
2                                                   3
3                                                   6
14. The method of claim 9, wherein said determining of the collection rating comprises computing the collection rating for said rating scale as follows:
$$CR = \frac{\sum_{i,j} r_i \, w_j \log(N_{ij} + 1)}{\sum_{i,j} w_j \log(N_{ij} + 1)}$$
where CR is the collection rating for said rating scale;
r_i is the weight applied for document rating group i;
w_j is the weight applied for document size group j;
N_ij is the number of pages in the collection with document rating i and size group j for said rating scale.
15. The method of claim 9, wherein said first collection of documents are web pages of a web site, and said first subset of documents are textual documents of said web site.
16. A method of operation on one or more data processing machines, the method comprising:
determining whether a first document collection comprises at least one document linked to at least one other document of at least one other second document collection;
determining a collection rating for a rating scale for each of said at least one other second document collection if said first document collection is determined to comprise at least one document linked to at least one other document of at least one other second document collection;
determining whether said first document collection comprises at least one document being linked by at least one other document of at least one other third document collection;
determining a collection rating for said rating scale for each of said at least one other third document collection if said first document collection is determined to comprise at least one document linked by at least one other third document collection; and
determining a link rating for said rating scale for said first document collection based on either said determined collection rating or ratings for said rating scale for said at least one other second document collection, or said determined collection rating or ratings for said rating scale for said at least one other third document collection, or both, depending on whether collection rating or ratings are determined for said rating scale for said at least one other second document collection, said at least one other third document collection or both.
17. The method of claim 16, wherein each of said determining of a collection rating for said rating scale for each of said at least one other second or third document collection comprises determining document ratings for said rating scale for documents of the particular document collection, and sizes of the documents, and determining the collection rating for the particular document collection based on the determined document ratings and the determined sizes.
18. The method of claim 16, wherein said determining of a link rating comprises summing said collection rating or ratings determined for said rating scale for said at least one other second or third document collection, and determining the link rating based on the result of said summing.
19. The method of claim 18, wherein said determining of the link rating based on the result of said summing comprises determining the link rating based on the result of said summing as follows:
Result of said summing (RS)                                     Link rating
RS less than −2                                                 −1.0
RS greater than or equal to −2, but less than −1                −0.5
RS greater than or equal to −1, but less than or equal to −0.5  0
RS greater than −0.5, but less than or equal to 1.5             0.5
RS greater than 1.5, but less than or equal to 3                1.0
RS greater than 3, but less than or equal to 4                  1.5
RS greater than 4                                               2.0
20. An apparatus comprising:
storage medium having stored therein a plurality of programming instructions designed to enable said apparatus to
determine a first collection rating for a first rating scale for contents of a first document collection,
determine a first link rating for said first rating scale for contents linked to or linked by contents of said first document collection, and
modify said first collection rating for said first rating scale for contents of said first document collection based on said determined first link rating for said first rating scale for contents linked to or linked by contents of said first document collection; and
at least one processor coupled to the storage medium to execute the programming instructions.
21. The apparatus of claim 20, wherein said programming instructions are designed to enable the apparatus to perform said determining of a first collection rating by determining said first collection rating based on document ratings of a first subset of documents of said first collection of documents, and sizes of the documents of the first subset of documents of the first document collection.
22. The apparatus of claim 21, wherein said first subset of documents of said first document collection consists of first textual documents of said first document collection.
23. The apparatus of claim 20, wherein said programming instructions are designed to enable the apparatus to perform said determining of a first link rating by determining at least a second collection rating for at least a second document collection with documents linked to or linked by documents of said first document collection, and determining said first link rating based on said determined at least a second collection rating of said at least a second document collection.
24. The apparatus of claim 20, wherein said programming instructions are designed to enable the apparatus to perform said modifying of the first collection rating by replacing the determined first collection rating with said determined first link rating.
25. The apparatus of claim 20, wherein said programming instructions are designed to enable the apparatus to perform said modifying of the first collection rating by adding said determined first link rating to the determined first collection rating.
26. The apparatus of claim 20, wherein said programming instructions are designed to enable the apparatus to perform said modifying of the first collection rating by subtracting said determined first link rating from the determined first collection rating.
27. The apparatus of claim 20, wherein said first document collection is a web site, and said contents of said first document collection are web pages.
28. An apparatus comprising:
storage medium having stored therein a plurality of programming instructions designed to enable said apparatus to
determine document ratings for a rating scale for a subset of documents of a document collection,
determine sizes of the documents of said subset,
determine a collection rating for said rating scale for said document collection based on said determined document ratings of said subset of documents, and normalized by said determined sizes of said subset of documents; and
at least one processor coupled to the storage medium to execute the programming instructions.
29. The apparatus of claim 28, wherein said programming instructions are designed to enable the apparatus to perform said determining of the collection rating by further subdividing said subset of documents into a plurality of groups in accordance with their determined sizes, and applying a weight to the document rating determined for said rating scale for each document of the subset in accordance to the document's size group classification.
30. The apparatus of claim 29, wherein said programming instructions are designed to enable the apparatus to apply weights to said determined document ratings for said rating scale as follows:
Document size range (in bytes)    Weight
<500                              1
500-999                           4
1000-4999                         7
5000-9999                         10
>9999                             13
31. The apparatus of claim 28, wherein said programming instructions are designed to enable the apparatus to perform said determining of the collection rating by further subdividing said subset of documents into a plurality of groups in accordance with their determined ratings for said rating scale, and applying a weight to the document rating determined for said rating scale for each document of the subset in accordance to the document's rate group classification.
32. The apparatus of claim 31, wherein said programming instructions are designed to enable the apparatus to apply weights to said determined document ratings for said rating scale as follows:
Determined document rating for said rating scale    Weight
0                                                   −0.5
1                                                   0.5
2                                                   3
3                                                   6
33. The apparatus of claim 28, wherein said programming instructions are designed to enable the apparatus to perform said determining of the collection rating by computing the collection rating for said rating scale as follows:
$$CR = \frac{\sum_{i,j} r_i \, w_j \log(N_{ij} + 1)}{\sum_{i,j} w_j \log(N_{ij} + 1)}$$
where CR is the collection rating for said rating scale;
r_i is the weight applied for document rating group i;
w_j is the weight applied for document size group j;
N_ij is the number of pages in the collection with document rating i and size group j for said rating scale.
34. The apparatus of claim 28, wherein said first collection of documents are web pages of a web site, and said first subset of documents are textual documents of said web site.
35. An apparatus comprising:
storage medium having stored therein a plurality of programming instructions designed to enable said apparatus to
determine whether a first document collection comprises at least one document linked to at least one other document of at least one other second document collection,
determine a collection rating for a rating scale for each of said at least one other second document collection if said first document collection is determined to comprise at least one document linked to at least one other document of at least one other second document collection,
determine whether said first document collection comprises at least one document being linked by at least one other document of at least one other third document collection,
determine a collection rating for said rating scale for each of said at least one other third document collection if said first document collection is determined to comprise at least one document linked by at least one other third document collection, and
determine a link rating for said rating scale for said first document collection based on either said determined collection rating or ratings for said rating scale for said at least one other second document collection, or said determined collection rating or ratings for said rating scale for said at least one other third document collection, or both, depending on whether collection rating or ratings are determined for said rating scale for said at least one other second document collection, said at least one other third document collection or both; and
at least one processor coupled to the storage medium to execute the programming instructions.
36. The apparatus of claim 35, wherein said programming instructions are designed to enable the apparatus to perform each of said determining of a collection rating for said rating scale for each of said at least one other second or third document collection by determining document ratings for said rating scale for documents of the particular document collection, and sizes of the documents, and determining the collection rating for the particular document collection based on the determined document ratings and the determined sizes.
37. The apparatus of claim 35, wherein said programming instructions are designed to enable the apparatus to perform said determining of a link rating by summing said collection rating or ratings determined for said rating scale for said at least one other second or third document collection, and determining the link rating based on the result of said summing.
38. The apparatus of claim 37, wherein said programming instructions are designed to enable the apparatus to perform said determining of the link rating based on the result of said summing by determining the link rating based on the result of said summing as follows:
Result of said summing (RS)                                     Link rating
RS less than −2                                                 −1.0
RS greater than or equal to −2, but less than −1                −0.5
RS greater than or equal to −1, but less than or equal to −0.5  0
RS greater than −0.5, but less than or equal to 1.5             0.5
RS greater than 1.5, but less than or equal to 3                1.0
RS greater than 3, but less than or equal to 4                  1.5
RS greater than 4                                               2.0
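The step function recited in claims 19 and 38 can be sketched in Python as follows. This is only an illustrative reading of the table, not the applicants' implementation: the function names `link_rating_from_sum` and `link_rating` are hypothetical, and the caller is assumed to pass at least one linked-collection rating.

```python
from typing import Iterable


def link_rating_from_sum(rs: float) -> float:
    """Map the summed collection ratings (RS) to a link rating.

    The thresholds follow the table in claims 19 and 38; the function
    name is a hypothetical label chosen for this sketch.
    """
    if rs < -2:
        return -1.0
    if rs < -1:        # -2 <= RS < -1
        return -0.5
    if rs <= -0.5:     # -1 <= RS <= -0.5
        return 0.0
    if rs <= 1.5:      # -0.5 < RS <= 1.5
        return 0.5
    if rs <= 3:        # 1.5 < RS <= 3
        return 1.0
    if rs <= 4:        # 3 < RS <= 4
        return 1.5
    return 2.0         # RS > 4


def link_rating(linked_collection_ratings: Iterable[float]) -> float:
    """Sum the ratings of the linked collections, then apply the step function.

    Assumption: the iterable is non-empty, i.e. at least one linked
    collection was found and rated.
    """
    return link_rating_from_sum(sum(linked_collection_ratings))
```

For example, two linked collections rated 1.0 and 2.5 sum to 3.5, which falls in the "greater than 3, but less than or equal to 4" row.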
Description
  • [0001]
    This application claims priority to provisional application Nos. 60/289,587, 60/289,400 and 60/289,418, all filed on May 7, 2001, entitled “Method of Assigning Ratings to Collections of Related Objects”, “Method and Apparatus for Automatically Determining Salient Features for Object Classification” and “Very-Large-Scale Automatic Categorizer For Web Content” respectively, having at least partial common inventorship as the present application.
  • BACKGROUND OF THE INVENTION
  • [0002]
    1. Field of the Invention
  • [0003]
    The present invention relates to the field of data processing. More specifically, the present invention relates to automated methods and systems for determining a rating for a rating scale for a collection of documents.
  • [0004]
    2. Background Information
  • [0005]
    The World Wide Web (WWW) is an expanding collection of textual and non-textual material that is accessible to any Internet user, from any location, at any time. Some users find particular contents to be objectionable. For example, parents often wish to shield their children from exposure to sexually explicit material, hate speech, and drug information. Similarly, companies may wish to prevent access by employees to web sites that provide or support gambling.
  • [0006]
    Notwithstanding the civil liberty implications associated with these concerns, a number of groups and companies have brought forward systems and techniques for assisting Internet users in blocking access to undesired content. For example, various blocking software products are available from software vendors, such as SafeSurf of Newbury Park, Calif., and NetNanny of Bellevue, Wash. Typically, these products employ site lists to effectuate blocking of access to undesired contents. These site lists include the identifications of the web sites containing undesired contents. Access to any of the web pages hosted by the identified web sites is blocked. Another example of such a system is described by Neilsen et al., “Selective downloading of file types contained in hypertext documents transmitted in a computer controlled network”, U.S. Pat. No. 6,098,102, which utilizes the file extensions of URLs to determine whether the particular files will or will not be downloaded to the user. Still another method for controlling access to web sites is typified by the work of the Internet Content Rating Association, which uses the technology of the Platform for Internet Content Selection (PICS) specification to allow voluntary, or in the future potentially mandatory, rating of page content by the content author. Filtering can then be done by utilizing these rating “tags”, and may be augmented by a complete block on other un-rated pages.
  • [0007]
    These prior art approaches suffer from at least the following disadvantages:
  • [0008]
    a) The WWW is constantly growing. The number of web sites and their contents are constantly changing. As a result, the prior art approaches are unable to keep pace with the changes.
  • [0009]
    b) Further, many web sites generate user-specific pages at every access. As a result, the prior art URL based approaches are unable to facilitate blocking of these dynamically generated pages if they contain undesired contents.
  • [0010]
    c) Additionally, content providers are often not the best, or even the appropriate, agent for rating their own contents. Duplicitous providers may deliberately mis-rate the appropriateness of their contents.
  • [0011]
    Some filtering systems rely on key word lists or text analysis, to judge the content of individual pages. While these systems may work satisfactorily on text files, they are ineffective for non-text materials, such as images, sound files, or movies.
  • [0012]
    Thus, an improved approach for blocking undesired contents is desired.
  • SUMMARY OF THE INVENTION
  • [0013]
    On one or more data processing systems, a collection rating is determined for a rating scale for contents of a document collection. A link rating is determined for the rating scale for contents linked to or linked by contents of the document collection. The collection rating for the rating scale for contents of the document collection is then modified, based on the determined link rating for the rating scale for contents linked to or linked by contents of the document collection.
  • [0014]
    In one embodiment, a collection rating for a rating scale for a document collection is determined based on document ratings of a subset of the documents of the document collection, and their sizes.
  • [0015]
    In one embodiment, the link rating for the rating scale for the document collection is determined based on the collection ratings of the document collections having contents linked to or linked by contents of the document collection.
  • [0016]
    In one embodiment, the document collection is a web site, the documents of the document collection are web pages of the web site, and the subset of documents employed to determine the web site rating is the textual documents.
  • [0017]
    Note: The term “document” as used herein in this application, including the specification and the claims, includes textual as well as non-textual documents, unless one or more types of “documents” are expressly excluded or implicitly excluded in view of the context of the usage.
  • BRIEF DESCRIPTION OF DRAWINGS
  • [0018]
    The present invention will be described by way of exemplary embodiments, but not limitations, illustrated in the accompanying drawings in which like references denote similar elements, and in which:
  • [0019]
    FIG. 1 illustrates an overview of the present invention in accordance with one embodiment;
  • [0020]
    FIG. 2 illustrates a method view of the present invention, in accordance with one embodiment;
  • [0021]
    FIG. 3 illustrates the operational flow for determining a collection rating, in accordance with one embodiment;
  • [0022]
    FIG. 4 illustrates the operational flow for determining a link rating, in accordance with one embodiment; and
  • [0023]
    FIG. 5 illustrates a computer system suitable for use to practice the present invention, in accordance with one embodiment.
  • [0024]
    Glossary
  • [0025]
    URL—Uniform Resource Locator
  • DETAILED DESCRIPTION OF THE INVENTION
  • [0026]
    As summarized earlier, the present invention includes improved methods and related apparatuses for determining a rating for a rating scale for a document collection. In the description to follow, various aspects of the present invention will be described. However, the present invention may be practiced with only some or all aspects of the present invention. For purposes of explanation, specific numbers, materials and configurations are set forth in order to provide a thorough understanding of the present invention. However, the present invention may be practiced without some of the specific details. In other instances, well known features are omitted or simplified in order not to obscure the present invention.
  • [0027]
    Parts of the description will be presented in terms of operations performed by a processor based device, using terms such as data, analyzing, assigning, selecting, determining, and the like, consistent with the manner commonly employed by those skilled in the art to convey the substance of their work to others skilled in the art. As well understood by those skilled in the art, the quantities take the form of electrical, magnetic, or optical signals capable of being stored, transferred, combined, and otherwise manipulated through mechanical and electrical components of the processor based device. The term “processor” includes microprocessors, micro-controllers, digital signal processors, and the like, that are standalone, adjunct or embedded.
  • [0028]
    Various operations will be described as multiple discrete steps in turn, in a manner that is most helpful in understanding the present invention. However, the order of description should not be construed as to imply that these operations are necessarily order dependent. In particular, these operations need not be performed in the order of presentation. Further, the description repeatedly uses the phrase “in one embodiment”, which ordinarily does not refer to the same embodiment, although it may.
  • Overview
  • [0029]
    Referring now to FIG. 1, wherein a block diagram illustrating an overview of the present invention, in accordance with one embodiment, is shown. As illustrated, collection rater 110 of the present invention is equipped to deduce a collection rating 112 for a rating scale for a document collection, such as collection 102. An example of a rating scale is a scale that quantitatively rates the contents of a subject collection on its “offensiveness”, e.g. ranging from 0 to 3, with 0 meaning “not offensive”, 1 meaning “mildly offensive”, 2 meaning “moderately offensive” and 3 meaning “very offensive”. As will be described in more detail below, collection rater 110 advantageously generates collection rating 112 for a collection taking into account not only the contents of the collection, but also contents of other collections linked to or linked by contents of the subject collection, such as collection 104 and collection 106 respectively. As those skilled in the art would appreciate, the inclusion of the contents linked to or linked by contents of the subject collection tends to strengthen the accuracy of the rating generated for the subject collection.
  • [0030]
    In one embodiment, collections 102, 104 and 106 are web sites, and documents 103, 105 and 107 are web pages of the web sites, including textual as well as non-textual, such as multi-media, web pages. In alternate embodiments, documents 103, 105 and 107 may be other content objects, with collections 102, 104 and 106 being other organizational entities of the content objects.
  • Method
  • [0031]
    Referring now to FIG. 2, wherein a block diagram illustrating a method view of the present invention, in accordance with one embodiment, is shown. As illustrated, for the embodiment, collection rater 110 generates a collection rating for a rating scale for a subject collection, by first determining an initial collection rating for the contents of the subject collection, block 202. Upon so determining, collection rater 110 determines a link rating for the contents of the linked collections, i.e. collections with contents linked to or linked by contents of the subject collection, block 204. Thereafter, for the illustrated embodiment, collection rater 110 modifies the initially determined collection rating, using the determined link rating, thereby taking into consideration the “linked” contents, block 206.
  • [0032]
    In one embodiment, in block 206, collection rater 110 modifies the initially determined collection rating by replacing the initially determined collection rating with the determined link rating. In another embodiment, in block 206, collection rater 110 modifies the initially determined collection rating by adding the determined link rating to the initially determined collection rating. In yet another embodiment, in block 206, collection rater 110 modifies the initially determined collection rating by subtracting the determined link rating from the initially determined collection rating. In yet other embodiments, in block 206, collection rater 110 may modify the initially determined collection rating by combining the determined link rating with the initially determined collection rating in other alternate manners.
  • [0033]
    The manner in which the determined link rating is to be combined with the initially determined collection rating to modify the initially determined collection rating to take into account the linked contents is application dependent. Preferably, the manner of combination is user configurable. Such user configuration may be facilitated through any one of a number of user configuration techniques known in the art, which are all within the abilities of those ordinarily skilled in the art. Accordingly, no further description of these user configuration techniques is necessary.
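The three combination modes named for block 206 (replace, add, subtract) can be sketched as a configurable step. The following Python is a minimal illustration under stated assumptions: `CombineMode` and `modify_collection_rating` are hypothetical names introduced here, and the specification leaves other combination manners open.

```python
from enum import Enum


class CombineMode(Enum):
    """Hypothetical configuration values for the block 206 combination."""
    REPLACE = "replace"
    ADD = "add"
    SUBTRACT = "subtract"


def modify_collection_rating(collection_rating: float,
                             link_rating: float,
                             mode: CombineMode) -> float:
    """Return the collection rating after taking the link rating into account."""
    if mode is CombineMode.REPLACE:
        # The link rating replaces the initially determined rating.
        return link_rating
    if mode is CombineMode.ADD:
        # The link rating is added to the initially determined rating.
        return collection_rating + link_rating
    if mode is CombineMode.SUBTRACT:
        # The link rating is subtracted from the initially determined rating.
        return collection_rating - link_rating
    raise ValueError(f"unsupported combine mode: {mode!r}")
```

Passing the mode as a parameter mirrors the stated preference that the manner of combination be user configurable.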
  • Collection Rating
  • [0034]
    Referring now to FIG. 3, wherein a block diagram illustrating the manner in which collection rater 110 generates a collection rating for a rating scale for a subject collection, in accordance with one embodiment, is shown. As illustrated, for the embodiment, collection rater 110 generates the collection rating for a rating scale for a subject collection by first determining the individual document ratings for a subset of the documents of the subject collection, block 302. In one embodiment, the subject collection comprises textual as well as non-textual, such as multi-media, documents. For the embodiment, the subset of the documents is the textual documents. The determination of the individual document ratings for the textual documents may be made in accordance with any one of a number of document rating techniques, e.g. by the salient features or keywords of each of the documents. Examples of these document rating techniques include but are not limited to those described in U.S. Provisional Applications Nos. 60/289,400 and 60/289,418, entitled “METHOD AND APPARATUS FOR AUTOMATICALLY DETERMINING SALIENT FEATURES FOR OBJECT CLASSIFICATION” and “VERY-LARGE-SCALE AUTOMATIC CATEGORIZER FOR WEB CONTENT” respectively, both filed on May 7, 2001. Both applications are hereby fully incorporated by reference.
  • [0035]
    In accordance with the present invention, in addition to determining the individual document ratings of the subset of the documents, collection rater 110 further determines the sizes of the documents, block 304. Then, collection rater 110 determines the collection rating by combining the determined individual document ratings in a size and rating normalized manner, block 306.
  • [0036]
    More specifically, in one embodiment, collection rater 110 combines the determined individual document ratings in a size and rating normalized manner, by grouping the documents in accordance with their determined sizes and determined ratings, and applying weights to the determined document ratings in accordance with their size group and rating group membership. In one embodiment, the weights are applied in accordance with the group sizes and determined ratings as set forth by the tables below:
    Document size range (bytes)   Weight
    <500                          1
    500-999                       4
    1000-4999                     7
    5000-9999                     10
    >9999                         13

    Determined document rating for said rating scale   Weight
    0                                                  −0.5
    1                                                  0.5
    2                                                  3
    3                                                  6
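    The grouping step described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the names `SIZE_BOUNDS`, `size_group` and `count_groups` are hypothetical, and each document is assumed to be available as a (rating, size in bytes) pair.

```python
from bisect import bisect_right
from collections import Counter

# Lower bounds, in bytes, of size groups 1..4 from the size table above.
# Group 0 is "<500" and group 4 is ">9999". The name is an assumption.
SIZE_BOUNDS = [500, 1000, 5000, 10000]


def size_group(size_bytes):
    """Return the index j (0..4) of the size group that contains size_bytes."""
    return bisect_right(SIZE_BOUNDS, size_bytes)


def count_groups(documents):
    """Tally N_ij: how many documents have rating i and fall in size group j.

    `documents` is an iterable of (rating, size_bytes) pairs (assumed input shape).
    """
    return Counter((rating, size_group(size)) for rating, size in documents)
```

    For instance, `count_groups([(3, 1200), (3, 4000), (0, 100)])` places the first two documents in the 1000-4999 group with rating 3, and the third in the <500 group with rating 0.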
  • [0037]
    The weights are applied in accordance with the formula set forth below:
    $$CR = \frac{\sum_{i,j} r_i \, w_j \log(N_{ij} + 1)}{\sum_{i,j} w_j \log(N_{ij} + 1)}$$
  • [0038]
    where CR is the collection rating for the rating scale;
  • [0039]
    $r_i$ is the weight applied for document rating group i;
  • [0040]
    $w_j$ is the weight applied for document size group j;
  • [0041]
    $N_{ij}$ is the number of pages in the collection having document rating i and falling in size group j, for the rating scale.
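    As a hedged sketch of the formula above, the collection rating can be computed from the group counts as shown below. The names `RATING_WEIGHTS`, `SIZE_WEIGHTS` and `collection_rating` are hypothetical. The source does not state the base of the logarithm; the natural logarithm is assumed here, which does not affect the result because the base cancels between numerator and denominator. Returning `None` for an empty collection is also an assumption, since the source does not address that case.

```python
import math

# Weights taken from the two tables of the embodiment.
RATING_WEIGHTS = {0: -0.5, 1: 0.5, 2: 3, 3: 6}   # r_i, keyed by document rating i
SIZE_WEIGHTS = [1, 4, 7, 10, 13]                 # w_j, indexed by size group j


def collection_rating(counts):
    """Compute CR from counts, a mapping (rating i, size group j) -> N_ij.

    Groups absent from `counts` have N_ij = 0 and contribute log(1) = 0,
    so summing over the present groups equals summing over all i, j.
    """
    numerator = 0.0
    denominator = 0.0
    for (i, j), n in counts.items():
        term = SIZE_WEIGHTS[j] * math.log(n + 1)
        numerator += RATING_WEIGHTS[i] * term
        denominator += term
    if denominator == 0:
        return None  # assumption: no rated documents, so no rating is produced
    return numerator / denominator
```

    Because the result is a weighted average of the rating weights, a collection whose documents all share one rating receives that rating's weight, and the logarithm damps the influence of groups that merely contain many documents.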
  • [0042]
    In alternate embodiments, for different rating scales, different rating and/or group size based weighting schemes, as well as other weighting schemes may be employed instead.
  • Link Rating
  • [0043]
    Referring now to FIG. 4, a block diagram illustrating the manner in which collection rater 110 generates a link rating for a rating scale for a subject collection, in accordance with one embodiment, is shown. As illustrated, for the embodiment, collection rater 110 generates the link rating for a rating scale for a subject collection by first generating the collection ratings for the collections having contents either linked to or linked by contents of the subject collection, block 402. The collection rating for the rating scale for each of the collections with contents either linked to or linked by contents of the subject collection may be generated in the same manner as the collection rating for the rating scale for the subject collection is generated, e.g. as earlier described, or in a different manner.
  • [0044]
    Upon so determining, for the illustrated embodiment, collection rater 110 sums the determined collection ratings for the rating scale for the other collections, block 404, then generates the link rating based on the resulting sum, block 406. In one embodiment, collection rater 110 generates the link rating based on the resulting sum in accordance with the discrete “step” function set forth below:
    Resulting sum (RS)                                              Link rating
    RS less than −2                                                 −1.0
    RS greater than or equal to −2, but less than −1                −0.5
    RS greater than or equal to −1, but less than or equal to −0.5  0
    RS greater than −0.5, but less than or equal to 1.5             0.5
    RS greater than 1.5, but less than or equal to 3                1.0
    RS greater than 3, but less than or equal to 4                  1.5
    RS greater than 4                                               2.0
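    The step function tabulated above can be sketched as follows. The function names `link_rating` and `link_rating_for` are hypothetical, and the collection ratings of the linked collections are assumed to be supplied as a plain list of numbers.

```python
def link_rating(resulting_sum):
    """Map the resulting sum (RS) to a link rating per the step table above."""
    rs = resulting_sum
    if rs < -2:
        return -1.0
    if rs < -1:        # -2 <= RS < -1
        return -0.5
    if rs <= -0.5:     # -1 <= RS <= -0.5
        return 0.0
    if rs <= 1.5:      # -0.5 < RS <= 1.5
        return 0.5
    if rs <= 3:        # 1.5 < RS <= 3
        return 1.0
    if rs <= 4:        # 3 < RS <= 4
        return 1.5
    return 2.0         # RS > 4


def link_rating_for(linked_collection_ratings):
    """Sum the linked collections' ratings (block 404), then apply the step (block 406)."""
    return link_rating(sum(linked_collection_ratings))
```

    For example, two linked collections rated 2.75 and 1.0 give a resulting sum of 3.75, which falls in the "greater than 3, but less than or equal to 4" band.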
  • [0045]
    In alternate embodiments, the link rating may be generated from the determined collection ratings of the “linked” collections employing different functions.
  • [0046]
    Accordingly, under the present invention, “linked” contents are taken into consideration to potentially strengthen the accuracy of the rating generated for a rating scale for a subject collection. As those skilled in the art would appreciate, the present invention may be practiced for one or more rating scales on one or more subject collections, each having zero or more “linked” collections. A subject collection with zero “linked” collections is merely a degenerate case in which no “linked” content contribution can be extracted to potentially strengthen the accuracy of the ratings generated for the rating scales for that subject collection.
  • Example Computer System
  • [0047]
    FIG. 5 illustrates an exemplary computer system 500 suitable for use in practicing the present invention, in accordance with one embodiment. As shown, computer system 500 includes one or more processors 502 and system memory 504. Additionally, computer system 500 includes one or more mass storage devices 506 (such as diskette, hard drive, CDROM and so forth), one or more input/output devices 508 (such as keyboard, cursor control and so forth) and communication interfaces 510 (such as network interface cards, modems and so forth). The elements are coupled to each other via system bus 512, which represents one or more buses. In the case of multiple buses, they are bridged by one or more bus bridges (not shown). Each of these elements performs its conventional functions known in the art. In particular, system memory 504 and mass storage 506 are employed to store a working copy (514a) and a permanent copy (514b) of the programming instructions implementing the teachings of the present invention (collection categorizer). The permanent copy (514b) of the programming instructions may be loaded into mass storage 506 in the factory, or in the field, as described earlier, through a distribution medium (not shown) or through communication interface 510 (from a distribution server (not shown)). The constitution of these elements 502-512 is known, and accordingly will not be further described.
  • [0048]
    In alternate embodiments, the present invention may be practiced on multiple systems sharing common and/or networked storage.
  • Modifications and Alterations
  • [0049]
    While the present invention has been described referencing the illustrated and above enumerated embodiments, the present invention is not limited to these described embodiments. Numerous modifications and alterations may be made, consistent with the scope of the present invention as set forth in the claims to follow. Of course, the above examples are merely illustrative. Based on the above descriptions, many other equivalent variations will be appreciated by those skilled in the art.
  • Conclusion and Epilogue
  • [0050]
    Thus, a method and apparatus for generating a collection rating for a document collection comprising textual and non-textual documents has been described. Since, as illustrated earlier, the present invention may be practiced with modification and alteration within the spirit and scope of the appended claims, the description is to be regarded as illustrative rather than restrictive of the present invention.
Classifications
U.S. Classification: 1/1, 707/E17.108, 707/999.005
International Classification: G06F17/30
Cooperative Classification: Y10S707/99935, Y10S707/99937, Y10S707/99933, G06F17/30864
European Classification: G06F17/30W1
Legal Events
Date            Code   Event
Aug 1, 2001     AS     Assignment
    Owner name: RULESPACE, INCORPORATED, OREGON
    Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LARASON, J. TODD;PACKER, ALAN J.;REEL/FRAME:012046/0722
    Effective date: 20010730
Jan 21, 2003    AS     Assignment
    Owner name: MICROSOFT CORPORATION, WASHINGTON
    Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:RULESPACE, INC.;REEL/FRAME:013684/0835
    Effective date: 20020823
May 20, 2009    FPAY   Fee payment
    Year of fee payment: 4
Mar 23, 2010    CC     Certificate of correction
Mar 18, 2013    FPAY   Fee payment
    Year of fee payment: 8
Dec 9, 2014     AS     Assignment
    Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON
    Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034541/0001
    Effective date: 20141014