CA2574779A1 - Intelligent query system and method using phrase-code frequency-inverse phrase-code document frequency module
- Publication number
- CA2574779A1
- Authority
- CA
- Canada
- Prior art keywords
- taxonomy
- terms
- documents
- phrase
- term
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
Abstract
An intelligent query system and method used in a search and retrieval system provide an end-user with the most relevant, meaningful, up-to-date, and precise search results. The system and method allow an end-user to benefit from an experienced recommendation that is tailored to a specific industry. For example, the system and method recognize that the phrases "strike outs" and "home run" are much more strongly correlated with "BASE" as opposed to "EQUITIES." When a search is conducted or a lookup is done in a map, the system and method recommend the strongest correlation as "BASE."
Description
INTELLIGENT QUERY SYSTEM AND METHOD USING PHRASE-CODE FREQUENCY-INVERSE PHRASE-CODE DOCUMENT FREQUENCY MODULE
CROSS-REFERENCE TO RELATED APPLICATION(S)
This application claims the benefit of U.S. Provisional Application No. 60/590,247, entitled "INTELLIGENT QUERY SYSTEM AND METHOD USING PHRASE-CODE FREQUENCY-INVERSE PHRASE-CODE DOCUMENT FREQUENCY MODULE", filed on July 22, 2004, the subject matter of which is hereby incorporated by reference. This application is also related to a co-pending patent application, U.S. Utility Application No. 11/060,928, filed on February 18, 2005, the subject matter of which is hereby incorporated by reference.
FIELD OF THE INVENTION
The present invention relates generally to a search and retrieval system, and more particularly, to an intelligent query system and method used in a search and retrieval system.
BACKGROUND OF THE INVENTION
Existing search query systems have been designed to help provide comprehensive search and retrieval services. However, terms or phrases used by writers may carry different meanings that belong to different categories. For example, many documents contain the phrases "strike outs" or "home run." These terms are generally related to baseball. Occasionally, these terms are also used when evaluating the performance of financial equities analysts, such as "Those Internet picks were major strike outs", or "Choosing MSFT back in '86 was a real home run."
In existing search and retrieval systems, the documents that contain "strike outs" or "home run" in the above example are searched and retrieved regardless of whether they are baseball documents or financial documents. Readers can become frustrated after wasting considerable time reading irrelevant documents.
Therefore, there is a need for an intelligent query system and method, used in a search and retrieval system, capable of providing intelligent and efficient search and retrieval.
SUMMARY OF THE INVENTION
The present invention provides an intelligent query system and method used in a search and retrieval system with a document feed and a categorization engine.
In one embodiment of the present invention, documents about baseball are marked with a taxonomy element "BASE", and those about equities are marked with "EQUITIES". Accordingly, the intelligent query system of the present invention recognizes that the phrases "strike outs" and "home run" are much more strongly correlated with "BASE" as opposed to "EQUITIES." Therefore, when a search is conducted or a lookup is done in a map, the system recommends the strongest correlation as "BASE."
In one embodiment of the present invention, an intelligent query ("IQ") method comprises the steps of:
providing a set or stream of documents (D) which contain text, pictures (with captions or other descriptive text), video/audio (with generated text transcript), and/or other multimedia formats;
categorizing each document into a taxonomy (C) with corresponding taxonomy elements wherein the taxonomy can be pre-defined or ad hoc;
filtering terms within the text to generate terms (Tt) and stop terms (Ts), wherein terms (Tt) are single words which express semantic value to the document to a certain meaningful degree, and stop terms (Ts) are single words which have little or no semantic value (e.g., "the", "an", and "a");
discarding the stop terms (Ts) and defining the remaining terms (Tt) as T;
transforming the terms (T) to eliminate multi-collinearity and correlating each transformed term t to each taxonomy element c on a containing document, wherein t is an element of T, and c is an element of C;
storing t and c in a database;
counting documents that contain c;
increasing a correlation value between term t and taxonomy element c each time the term t appears in the document; and
continuing the above steps for all remaining documents.
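The counting steps listed above can be sketched in Python roughly as follows. This is a minimal illustration, not the patented implementation: the function names `extract_terms` and `accumulate`, the stop-term set, and the three sample documents are all hypothetical. The source does not say how terms are transformed to eliminate multi-collinearity, so lowercasing and punctuation stripping stand in for that step purely as a placeholder, and the documents are assumed to arrive already categorized.

```python
from collections import Counter, defaultdict

# Hypothetical stop-term list; the source names only "the", "an", and "a".
STOP_TERMS = {"the", "an", "a"}


def extract_terms(text):
    """Split text into lowercase single-word terms and drop stop terms (Ts)."""
    words = (w.strip('.,;:!?"\'()').lower() for w in text.split())
    return [w for w in words if w and w not in STOP_TERMS]


def accumulate(documents):
    """Count documents per taxonomy element and term/element co-occurrences.

    `documents` is an iterable of (text, taxonomy_elements) pairs, so each
    document is assumed to have been categorized beforehand.
    """
    docs_with_code = Counter()          # number of documents that contain c
    correlation = defaultdict(Counter)  # correlation[t][c], raised once per appearance of t
    for text, codes in documents:
        terms = extract_terms(text)
        for c in set(codes):
            docs_with_code[c] += 1
            for t in terms:
                correlation[t][c] += 1
    return docs_with_code, correlation


# Toy documents invented for illustration only.
docs = [
    ("The pitcher had ten strikeouts and a homer", ["BASE"]),
    ("A homer in the ninth won the game", ["BASE"]),
    ("Analysts called the stock pick a homer", ["EQUITIES"]),
]
docs_with_code, correlation = accumulate(docs)
print(docs_with_code["BASE"], correlation["homer"]["BASE"], correlation["homer"]["EQUITIES"])
```

In this toy run the term "homer" accumulates a higher count against "BASE" than against "EQUITIES", which is the kind of raw data a later scoring step could work from.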
With the data collected from the above process, an IQ map can be generated by the following steps:
scoring t-c pairs according to a PCF-IPCDF scoring system or model;
loading the pairs with the highest scores into a map structure for facilitating lookup of the taxonomy element c from the term element t; and
deducing the taxonomy element c from term t.
One exemplary PCF-IPCDF scoring system or model is described in the co-pending patent application, U.S. Utility Application No. 11/060,928, filed on February 18, 2005, the subject matter of which is hereby incorporated by reference.
The map structure can be loaded into applications which benefit from being able to deduce relevant taxonomy elements from terms. Such applications include, but are not limited to, search engines and tracking engines.
Some exemplary uses of the map (or IQ map) include guiding a user toward relevant search topics, presenting a user with a list of related taxonomy terms, and/or transparently focusing a search for a user.
Thus, in the above baseball example, the intelligent query system of the present invention recognizes that the phrases "strike outs" and "home run" are much more strongly correlated with "BASE" as opposed to "EQUITIES." Therefore, when a lookup is done in the map, the system recommends the strongest correlation as "BASE."
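The uses of the IQ map mentioned above (suggesting related taxonomy elements and transparently focusing a search) might look like the following Python sketch. The helper names `suggest_codes` and `focus_results`, the substring matching of phrases against the query, and the sample map and results are all assumptions made for illustration; the source does not describe how an application consumes the map.

```python
def suggest_codes(query, iq_map):
    """Return taxonomy elements whose mapped phrase occurs in the query.

    Matching is a plain substring test, which is an assumption of this sketch.
    """
    q = query.lower()
    suggestions = []
    for phrase, codes in iq_map.items():
        if phrase in q:
            for code in codes:
                if code not in suggestions:
                    suggestions.append(code)
    return suggestions


def focus_results(results, codes):
    """Keep only results tagged with at least one suggested taxonomy element.

    If nothing was suggested, the results are returned unfiltered.
    """
    if not codes:
        return list(results)
    return [r for r in results if set(r["codes"]) & set(codes)]


# Hypothetical map contents and search results.
iq_map = {"strike outs": ["BASE"], "home run": ["BASE"]}
results = [
    {"title": "Season home run leaders", "codes": ["BASE"]},
    {"title": "Fund manager hits a home run", "codes": ["EQUITIES"]},
]

codes = suggest_codes("home run records", iq_map)
focused = focus_results(results, codes)
print(codes, [r["title"] for r in focused])
```

Here the query containing "home run" is steered toward "BASE", so the financial article that merely uses the phrase figuratively is filtered out.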
These and other features and advantages of the present invention will become apparent to those skilled in the art from the attached detailed descriptions, wherein illustrative embodiments of the present invention are shown and described, including best modes contemplated for carrying out the invention. As will be realized, the invention is capable of modifications in various obvious aspects, all without departing from the spirit and scope of the present invention. Accordingly, the descriptions are to be regarded as illustrative in nature and not restrictive.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 illustrates a flow chart of one exemplary intelligent query process in accordance with the principles of the present invention.
Figure 2 illustrates a flow chart of one exemplary process of generating an IQ map in the intelligent query process in accordance with the principles of the present invention.
DETAILED DESCRIPTIONS OF THE PREFERRED EMBODIMENT
The present invention provides an intelligent query system and method used in a search and retrieval system with a document feed and a categorization engine.
Figure 1 shows an exemplary intelligent query process 100 in accordance with the principles of the present invention. The process 100 starts with a step 102 of providing a set or stream of documents (D) which contain text, pictures (with captions or other descriptive text), video/audio (with generated text transcript), and/or other multimedia formats. Then, in a step 104, each document is categorized into a taxonomy (C) with corresponding taxonomy elements, wherein the taxonomy can be pre-defined or ad hoc. In the next step 106, terms within the text are filtered to generate terms (Tt) and stop terms (Ts), wherein terms (Tt) are single words which express semantic value to the document to a certain meaningful degree, and stop terms (Ts) are single words which have little or no semantic value (e.g., "the", "an", and "a"). Then, the stop terms (Ts) are discarded, and the remaining terms (Tt) are defined as T in a step 108. Next, in a step 110, the terms (T) are transformed to eliminate multi-collinearity, and each transformed term t is correlated to each taxonomy element c on a containing document, wherein t is an element of T, and c is an element of C. The term t and the taxonomy element c are then stored in a database in a step 112. Then, documents that contain c are counted in a step 114. In a next step 116, a correlation value between term t and taxonomy element c is increased each time the term t appears in the document. The above steps are repeated for all remaining documents.
Figure 2 shows one exemplary process 200 of generating an IQ map in the intelligent query process in accordance with the principles of the present invention. The process 200 starts with a step 202 of scoring t-c pairs according to a PCF-IPCDF scoring system or model. Then, in a step 204, the t-c pairs with the highest scores are loaded into a map structure for facilitating lookup of the taxonomy element c from the term element t. Next, the taxonomy element c is deduced from the term element t in a step 206.
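Process 200 can be illustrated with the Python sketch below. The actual PCF-IPCDF model is defined in the co-pending application and is not reproduced in this document, so the formula in `score_pairs` is only an assumed TF-IDF-style stand-in: a phrase-code frequency (pair count divided by the number of documents carrying the code) multiplied by a logarithmic inverse document frequency of the term. The function names, the `top_n` parameter, and every number in the sample data are hypothetical.

```python
import math


def score_pairs(pair_counts, docs_with_code, docs_with_term, total_docs):
    """Score each (t, c) pair with an assumed TF-IDF-style formula (step 202)."""
    scores = {}
    for (t, c), n in pair_counts.items():
        pcf = n / docs_with_code[c]                            # how often t occurs under c
        ipcdf = math.log(1 + total_docs / docs_with_term[t])   # rarer terms weigh more
        scores[(t, c)] = pcf * ipcdf
    return scores


def build_iq_map(scores, top_n=1):
    """Load the top_n highest-scoring taxonomy elements per term into a map (step 204)."""
    by_term = {}
    for (t, c), s in scores.items():
        by_term.setdefault(t, []).append((s, c))
    return {
        t: [c for _, c in sorted(pairs, reverse=True)[:top_n]]
        for t, pairs in by_term.items()
    }


# Toy counts invented for illustration only.
pair_counts = {
    ("home run", "BASE"): 40,
    ("home run", "EQUITIES"): 2,
    ("dividend", "EQUITIES"): 30,
}
docs_with_code = {"BASE": 50, "EQUITIES": 50}
docs_with_term = {"home run": 42, "dividend": 30}

scores = score_pairs(pair_counts, docs_with_code, docs_with_term, total_docs=100)
iq_map = build_iq_map(scores)

# Step 206: deduce the taxonomy element from the term by a map lookup.
print(iq_map["home run"], iq_map["dividend"])
```

Keeping only the top-scoring elements per term keeps the map small enough to load into a search or tracking application, at the cost of discarding weaker associations such as "home run" with "EQUITIES".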
It is noted that an exemplary PCF-IPCDF scoring system or model has been described in the co-pending patent application, U.S. Utility Application No. 11/060,928, filed on February 18, 2005, the subject matter of which is hereby incorporated by reference.
The map structure can be loaded into applications which benefit from being able to deduce relevant taxonomy elements from terms. Such applications include, but are not limited to, search engines and tracking engines.
As a result, documents about baseball are marked with a taxonomy element "BASE", and those about equities are marked with "EQUITIES". The intelligent query system of the present invention recognizes that the phrases "strike outs" and "home run" are much more strongly correlated with "BASE" as opposed to "EQUITIES." Therefore, when a search is conducted or a lookup is done in a map, the system recommends the strongest correlation as "BASE."
One of the advantages of the present invention is that it provides end-users the most relevant, meaningful, up-to-date, and precise search results.
Another advantage of the present invention is that an end-user is able to benefit from an experienced recommendation that is tailored to a specific industry.
These and other features and advantages of the present invention will become apparent to those skilled in the art from the attached detailed descriptions, wherein illustrative embodiments of the present invention are shown and described, including best modes contemplated for carrying out the invention. As will be realized, the invention is capable of modifications in various obvious aspects, all without departing from the spirit and scope of the present invention. Accordingly, the above detailed descriptions are to be regarded as illustrative in nature and not restrictive.
Claims (3)
1. An intelligent query method, comprising the steps of:
providing a plurality of documents which contain multimedia contents;
categorizing each of the documents into a taxonomy with corresponding taxonomy elements wherein the taxonomy is pre-defined;
filtering/transforming the multimedia contents and discarding a portion of the taxonomy elements;
storing the filtered/transformed multimedia contents in a database; and
calculating a correlation value of the filtered/transformed multimedia contents.
2. An intelligent query method used in a search and retrieval system, comprising the steps of:
providing a plurality of documents which contain multimedia contents including text;
categorizing each of the documents into a taxonomy with corresponding taxonomy elements wherein the taxonomy is pre-defined;
filtering terms within the text to generate terms (Tt) and stop terms (Ts), wherein terms (Tt) are single words which express semantic value to the document, and stop terms (Ts) are single words which express no semantic value;
discarding the stop terms (Ts) and defining the remaining terms (Tt) as T;
transforming the terms (T) to eliminate multi-collinearity and correlating the transformed terms t to each taxonomy element c on a containing document, wherein t is an element of T, and c is an element of C;
storing t and c in a database;
counting the documents that contain c; and
increasing a correlation value between term t and taxonomy element c each time when the term t appears in the document.
3. The method of claim 2, further comprising a step of generating an IQ map, which comprises:
scoring t-c pairs according to a PCF-IPCDF scoring system or model;
loading the t-c pairs with the highest scores into a map structure for facilitating lookup of the taxonomy element c from the term element t; and
deducing the taxonomy element c from the term element t.
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US59024704P | 2004-07-22 | 2004-07-22 | |
US60/590,247 | 2004-07-22 | ||
US11/112,439 US7698333B2 (en) | 2004-07-22 | 2005-04-22 | Intelligent query system and method using phrase-code frequency-inverse phrase-code document frequency module |
US11/112,439 | 2005-04-22 | ||
PCT/US2005/013969 WO2006022897A1 (en) | 2004-07-22 | 2005-04-25 | Intelligent query system and method using phrase-code frequency-inverse phrase-code document frequency module |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2574779A1 (en) | 2006-03-02 |
Family
ID=35758608
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002574779A Abandoned CA2574779A1 (en) | 2004-07-22 | 2005-04-25 | Intelligent query system and method using phrase-code frequency-inverse phrase-code document frequency module |
Country Status (5)
Country | Link |
---|---|
US (1) | US7698333B2 (en) |
EP (1) | EP1782273A1 (en) |
AU (1) | AU2005278138A1 (en) |
CA (1) | CA2574779A1 (en) |
WO (1) | WO2006022897A1 (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7249034B2 (en) | 2002-01-14 | 2007-07-24 | International Business Machines Corporation | System and method for publishing a person's affinities |
US7917519B2 (en) * | 2005-10-26 | 2011-03-29 | Sizatola, Llc | Categorized document bases |
US9396505B2 (en) | 2009-06-16 | 2016-07-19 | Medicomp Systems, Inc. | Caregiver interface for electronic medical records |
US10319466B2 (en) * | 2012-02-20 | 2019-06-11 | Medicomp Systems, Inc | Intelligent filtering of health-related information |
US8954463B2 (en) * | 2012-02-29 | 2015-02-10 | International Business Machines Corporation | Use of statistical language modeling for generating exploratory search results |
EP2680172A3 (en) * | 2012-06-29 | 2014-01-22 | Orange | Other user content-based collaborative filtering |
US11928606B2 (en) | 2013-03-15 | 2024-03-12 | TSG Technologies, LLC | Systems and methods for classifying electronic documents |
US9298814B2 (en) | 2013-03-15 | 2016-03-29 | Maritz Holdings Inc. | Systems and methods for classifying electronic documents |
US10430906B2 (en) | 2013-03-15 | 2019-10-01 | Medicomp Systems, Inc. | Filtering medical information |
WO2014145824A2 (en) | 2013-03-15 | 2014-09-18 | Medicomp Systems, Inc. | Electronic medical records system utilizing genetic information |
US10089687B2 (en) * | 2015-08-04 | 2018-10-02 | Fidelity National Information Services, Inc. | System and associated methodology of creating order lifecycles via daisy chain linkage |
Family Cites Families (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3132738B2 (en) * | 1992-12-10 | 2001-02-05 | ゼロックス コーポレーション | Text search method |
US5758257A (en) | 1994-11-29 | 1998-05-26 | Herz; Frederick | System and method for scheduling broadcast of and access to video programs and other data using customer profiles |
GB2300991B (en) * | 1995-05-15 | 1997-11-05 | Andrew Macgregor Ritchie | Serving signals to browsing clients |
US5724571A (en) | 1995-07-07 | 1998-03-03 | Sun Microsystems, Inc. | Method and apparatus for generating query responses in a computer-based document retrieval system |
US6067552A (en) * | 1995-08-21 | 2000-05-23 | Cnet, Inc. | User interface system and method for browsing a hypertext database |
US6038561A (en) * | 1996-10-15 | 2000-03-14 | Manning & Napier Information Services | Management and analysis of document information text |
US5924090A (en) | 1997-05-01 | 1999-07-13 | Northern Light Technology Llc | Method and apparatus for searching a database of records |
US6233575B1 (en) | 1997-06-24 | 2001-05-15 | International Business Machines Corporation | Multilevel taxonomy based on features derived from training documents classification using fisher values as discrimination values |
US6292830B1 (en) * | 1997-08-08 | 2001-09-18 | Iterations Llc | System for optimizing interaction among agents acting on multiple levels |
US5960422A (en) * | 1997-11-26 | 1999-09-28 | International Business Machines Corporation | System and method for optimized source selection in an information retrieval system |
US6418433B1 (en) | 1999-01-28 | 2002-07-09 | International Business Machines Corporation | System and method for focussed web crawling |
US6711585B1 (en) * | 1999-06-15 | 2004-03-23 | Kanisa Inc. | System and method for implementing a knowledge management system |
US6260041B1 (en) * | 1999-09-30 | 2001-07-10 | Netcurrents, Inc. | Apparatus and method of implementing fast internet real-time search technology (first) |
US6868525B1 (en) * | 2000-02-01 | 2005-03-15 | Alberti Anemometer Llc | Computer graphic display visualization system and method |
US7035864B1 (en) * | 2000-05-18 | 2006-04-25 | Endeca Technologies, Inc. | Hierarchical data-driven navigation system and method for information retrieval |
US6910035B2 (en) * | 2000-07-06 | 2005-06-21 | Microsoft Corporation | System and methods for providing automatic classification of media entities according to consonance properties |
US6657117B2 (en) * | 2000-07-14 | 2003-12-02 | Microsoft Corporation | System and methods for providing automatic classification of media entities according to tempo properties |
US20030217052A1 (en) | 2000-08-24 | 2003-11-20 | Celebros Ltd. | Search engine method and apparatus |
US6735583B1 (en) * | 2000-11-01 | 2004-05-11 | Getty Images, Inc. | Method and system for classifying and locating media content |
US6873990B2 (en) * | 2001-02-07 | 2005-03-29 | International Business Machines Corporation | Customer self service subsystem for context cluster discovery and validation |
US20030014405A1 (en) | 2001-07-09 | 2003-01-16 | Jacob Shapiro | Search engine designed for handling long queries |
US7249034B2 (en) | 2002-01-14 | 2007-07-24 | International Business Machines Corporation | System and method for publishing a person's affinities |
US6801905B2 (en) | 2002-03-06 | 2004-10-05 | Sybase, Inc. | Database system providing methodology for property enforcement |
US7437349B2 (en) | 2002-05-10 | 2008-10-14 | International Business Machines Corporation | Adaptive probabilistic query expansion |
WO2004013770A2 (en) * | 2002-07-26 | 2004-02-12 | Ron Everett | Data management architecture associating generic data items using reference |
US7146361B2 (en) | 2003-05-30 | 2006-12-05 | International Business Machines Corporation | System, method and computer program product for performing unstructured information management and automatic text analysis, including a search operator functioning as a Weighted AND (WAND) |
US7447677B2 (en) * | 2003-06-27 | 2008-11-04 | Microsoft Corporation | System and method for enabling client applications to interactively obtain and present taxonomy information |
US7577655B2 (en) * | 2003-09-16 | 2009-08-18 | Google Inc. | Systems and methods for improving the ranking of news articles |
CA2556023A1 (en) * | 2004-02-20 | 2005-09-09 | Dow Jones Reuters Business Interactive, Llc | Intelligent search and retrieval system and method |
US7266548B2 (en) * | 2004-06-30 | 2007-09-04 | Microsoft Corporation | Automated taxonomy generation |
-
2005
- 2005-04-22 US US11/112,439 patent/US7698333B2/en active Active
- 2005-04-25 CA CA002574779A patent/CA2574779A1/en not_active Abandoned
- 2005-04-25 AU AU2005278138A patent/AU2005278138A1/en not_active Abandoned
- 2005-04-25 EP EP05749881A patent/EP1782273A1/en not_active Withdrawn
- 2005-04-25 WO PCT/US2005/013969 patent/WO2006022897A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2006022897A1 (en) | 2006-03-02 |
US7698333B2 (en) | 2010-04-13 |
AU2005278138A1 (en) | 2006-03-02 |
US20060031218A1 (en) | 2006-02-09 |
EP1782273A1 (en) | 2007-05-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20060031218A1 (en) | Intelligent query system and method using phrase-code frequency-inverse phrase-code document frequency module | |
US7769751B1 (en) | Method and apparatus for classifying documents based on user inputs | |
US10565233B2 (en) | Suffix tree similarity measure for document clustering | |
US8176418B2 (en) | System and method for document collection, grouping and summarization | |
US7249121B1 (en) | Identification of semantic units from within a search query | |
US8200597B2 (en) | System and method for classifiying text and managing media contents using subtitles, start times, end times, and an ontology library | |
US6519586B2 (en) | Method and apparatus for automatic construction of faceted terminological feedback for document retrieval | |
Rehman et al. | Selection of the most relevant terms based on a max-min ratio metric for text classification | |
US9015194B2 (en) | Root cause analysis using interactive data categorization | |
US8140550B2 (en) | System and method for bounded analysis of multimedia using multiple correlations | |
US20100070507A1 (en) | Hybrid content recommending server, system, and method | |
US9256649B2 (en) | Method and system of filtering and recommending documents | |
US20100057559A1 (en) | method of choosing advertisements to be shown to a search engine user | |
EP1426881A2 (en) | Information storage and retrieval | |
Benitez et al. | Semantic knowledge construction from annotated image collections | |
Dobrynin et al. | Contextual document clustering | |
Ly et al. | Product review summarization based on facet identification and sentence clustering | |
Ju et al. | A weighting scheme for tag recommendation in social bookmarking systems | |
Wondergem et al. | Matching index expressions for information retrieval | |
Wen et al. | A multi-paradigm querying approach for a generic multimedia database management system | |
Jain et al. | Building query optimizers for information extraction: the sqout project | |
Perea-Ortega et al. | Semantic tagging of video ASR transcripts using the web as a source of knowledge | |
US8868543B1 (en) | Finding web pages relevant to multimedia streams | |
JP2003167891A (en) | Word significance calculating method, device, program and recording medium | |
Torjmen et al. | XML Multimedia Retrieval: From relevant textual information to relevant multimedia fragments | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FZDE | Discontinued |