|Publication number||US7685118 B2|
|Application number||US 10/916,617|
|Publication date||Mar 23, 2010|
|Filing date||Aug 12, 2004|
|Priority date||Aug 12, 2004|
|Also published as||US20060047632|
|Publication number||10916617, 916617, US 7685118 B2, US 7685118B2, US-B2-7685118, US7685118 B2, US7685118B2|
|Original Assignee||Iwint International Holdings Inc.|
1. Field of the Invention
The present invention relates to automating the solution of inventor and user problems, and more particularly, to using semantic methods of information and knowledge representation and processing for solving such problems.
2. Related Art
Solving inventor problems and technical problems of a user may require, first of all, good information support, i.e. operative access to information or knowledge. Good information support can answer how to solve the problem or can facilitate providing information related to the solution of the problem (for example, information in another knowledge domain, in the same knowledge domain but in another type of system, etc.) that is able to point the inventor or user in the needed direction for the solution search. Conventionally, computer based information retrieval may be performed by means of a search engine.
In unsophisticated information retrieval systems, a search may be performed by searching for the presence of key words (inputted by the user) in documents contained in a database. This kind of search may be characterized by low precision and recall. Modern information retrieval systems should give the user the possibility of formulating a query in natural language, i.e. the systems should have a natural language user interface. An automatic linguistic analysis of the query is then performed and its formal representation is created. The linguistic analysis can be performed at different levels of depth of natural language. This analysis, in an ideal case, should include the semantic level. It is important to recognize not only relations between different elements of the query (usually, the most informative elements), but also the relations between query elements and the corresponding components from an outer world model or a certain knowledge domain. Thus, it became desirable to use semantic relations between concepts, described in models of knowledge representation such as a thesaurus or ontology, to improve information retrieval system performance in different manners in various applications.
An ontology is a hierarchical lexical structure in which concepts expressed by words or word-combinations are defined and linked by semantic relations. Ontologies can be domain-specific or general, depending on the terms they describe, and attempt to reflect a human's knowledge about the specific domain or the surrounding world. Since an ontology represents a valuable and extensive set of data, it can be successfully used in information retrieval to improve the precision and the recall of search results.
Some information retrieval systems like the system described in U.S. Pat. No. 6,675,159 B1 (“the '159 patent”), the contents of which is incorporated herein by reference in its entirety, use ontologies to index collections of documents with ontology-based predicate structures. The system of the '159 patent extracts the concepts behind user queries to return only the documents that match these concepts. The system has the capabilities of an ontology-based search system and it can search for logically structured groupings of items from the ontology. For example, from an exemplary query “What is the current situation of the stock market?” an attribute extractor extracts direct attributes “current”, “situation”, “stock”, and “market” from the query. The attribute extractor can also, e.g., expand attribute “stock” to “finance”, “banks”, “brokerages”, “Wall Street”, etc. by using an ontology that contains hierarchically-arranged concepts.
The knowledge base search and retrieval system described in U.S. Pat. No. 5,940,821 (“the '821 patent”) and the related document knowledge base research and retrieval system described in U.S. Pat. No. 6,460,034 B1 (“the '034 patent”) (the contents of both of which are incorporated herein by reference in their entireties) use a knowledge base, which stores associations among terminology and categories that have a lexical, semantic or usage association, to identify document theme vectors (by inferring topics from the terminology of a document) and to classify documents into categories. The systems are also able to retrieve a relevant document in response to a query by expanding the query terms and the theme with the help of the knowledge base. The '034 system includes factual knowledge base queries as well as concept knowledge base queries. The factual knowledge base queries identify, in response to a query, the relevant themes, and the documents classified for those themes. In contrast, the concept knowledge base queries do not identify specific documents in response to a query, but identify the potential existence of a document by displaying associated categories and themes.
The content processing system of the '821 and '034 patents includes a linguistic engine, a knowledge catalog processor, a theme vector processor, and a morphology section. The linguistic engine, which includes a grammar parser and a theme parser, processes the document set by analyzing the grammatical or contextual aspects of each document, as well as analyzing the stylistic and thematic attributes of each document. Specifically, the linguistic engine generates, as part of the structured output, contextual tags, thematic tags, and stylistic tags that characterize each document.
The knowledge base of the '821 and '034 patents is used to generate an expanded set of query terms, and the expanded query term set is used to select additional documents. To expand a query term using the knowledge base, the levels or tiers of the classification hierarchy as well as the knowledge base associations are used to select nodes within predefined criteria. In one embodiment, the query term strength is decreased based on the distance weight (e.g. query term weights are decreased by 50% for each point of semantic distance when expanding either to a more general category, such as a parent category, or to an association), and all nodes with a resultant query term weight greater than one are selected. All child categories and terms beneath a node are selected.
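The weighting rule described above can be illustrated with a minimal Python sketch. The function names, the starting weight of 8 and the node names are hypothetical and chosen only for illustration; the 50% decrease per point of semantic distance and the "greater than one" selection threshold are taken from the description above.

```python
from typing import Dict


def expanded_weight(initial_weight: float, semantic_distance: int,
                    decay: float = 0.5) -> float:
    """Weight left after moving `semantic_distance` points away from the term.

    Each point of distance multiplies the weight by `decay` (50% by default).
    """
    return initial_weight * decay ** semantic_distance


def select_nodes(initial_weight: float, distances: Dict[str, int],
                 threshold: float = 1.0) -> Dict[str, float]:
    """Keep only nodes whose resulting weight is strictly above `threshold`."""
    weights = {node: expanded_weight(initial_weight, d)
               for node, d in distances.items()}
    return {node: w for node, w in weights.items() if w > threshold}


# Hypothetical nodes at increasing semantic distance from the query term.
selected = select_nodes(8.0, {"stock": 0, "finance": 1, "economy": 2, "society": 3})
print(selected)
```

With these assumed values the node at distance 3 ends with a weight of exactly one and is therefore not selected.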
However, the system of the '821 and '034 patents is oriented mainly to theme vector identification. The '034 system requires the documents from the retrieved database to be indexed with special contextual, thematic and stylistic tags and the query terms expansion based on ontology is used to retrieve additional documents taking into consideration its theme vector.
Ontology also is conventionally applied in database management systems. In International patent application publication No. WO-2003/030025A1 (“the '025 publication”), the contents of which is incorporated herein by reference in its entirety, the database management system uses ontology to solve the problems of semantic heterogeneity, and semantic mismatch and query integration against distributed resources. The proposed solution to the problems of semantic heterogeneity is to formally specify the meaning of the terminology of each system using ontologies (shared and personal ones). Thus, the system of the '025 publication provides a distributed query solution for a network having a plurality of database resources. The network helps users to make queries which retrieve and join data from more than one resource, which may be of more than one type such as an SQL or XML database.
Consequently, ontology is used in the system of the '025 publication to disambiguate terminology vagueness while retrieving information from different heterogeneous information resources.
In U.S. patent application Publication No. 2002/0107844 A1 (“the '844 publication”), the contents of which is incorporated herein by reference in its entirety, ontology is described as being used in an information generation and retrieval system as an instrument that helps to build a semantic representation of a sentence in the form of a conceptual graph. During the information request procedure, a natural language query of a user is transformed into a conceptual graph by analyzing sentence structure and semantic structure. The database is then searched for the conceptual graph that is nearest in sense to the conceptual graph of the query, and semantic appropriateness is computed in order to display to the user the information indexed by the conceptual graph that was found.
Thus, the application of ontology in information retrieval implies conceptual graph building as well as the query and database conceptual graph comparison.
The method and apparatus for active information discovery and retrieval, described in U.S. Pat. No. 6,498,795 B1, (“the '795 patent”), the contents of which is incorporated herein by reference in its entirety, use an active network framework and an ontology-based information hierarchy for semantic structuring and automated information binding, and provide a symmetrical framework for information filtering and binding in the network. Queries from information requesters are directly routed to relevant information sources and contents from information providers are distributed to the destinations that expressed an interest in the information.
The method of the '795 patent implies creating content ontology instance trees and query ontology instance trees on each of the active network nodes. Active networks architecture and an ontology-based information hierarchy are used as the network and the semantic frameworks respectively. The system uses simple hypertext markup language (HTML) ontology extensions (SHOE). When a SHOE instance makes specific claims based upon a particular ontology, a software agent can draw on that particular ontology to infer knowledge that is not directly stated. The ontology provides context as implicit knowledge. SHOE tags allow defining new ontologies based on existing ones. The search operational model is applied on any part of sub-hierarchy of the ontology instance tree. Special coefficients are calculated to determine the probability of the child nodes of ontology to be accessed with the parent node of ontology.
Hence, ontology in the '795 patent is used for semantic structuring of the retrieved information, which implies prior annotation of the information resources with ontological tags (using SHOE, either automatically or manually); only then is it possible to retrieve information based on the ontology relations represented by SHOE tags.
In U.S. patent application Publication No. 2002/0116169 A1 (“the '169 publication”), the contents of which is incorporated herein by reference in its entirety, a method and apparatus for generating normalized representations of strings is described. Ontologies, thesauri, and terminological databases are used therein as means for normalization of semantic representation of the string.
The described method of the '169 publication attempts to increase the retrieval performance of information retrieval systems by suggesting use of ontology to semantically normalize query and database strings.
An ontology-based information management system and method, described in U.S. patent application Publication No. 2003/0177112, (“the '112 publication”), the contents of which is incorporated herein by reference in its entirety, uses ontology to provide semantic mapping between entries in a structured data source, and concepts in an unstructured data source and includes processes for creating, validating, augmenting, and combining ontologies for life sciences, informatics and other disciplines. The system of the '112 publication proposes to use an ontology to enable effective syntactic and semantic mapping between mapping entities discovered using concept-based text searching, and those derived from data warehousing and mining in a plurality of disciplines.
The system of the '112 publication may evaluate the distance between a pair of terms in a given information space using an information retrieval engine capable of categorizing large document sets.
Nevertheless, the proposed method of the '112 publication is mainly oriented to managing information sources based on ontologies, which help to integrate structured and unstructured data. The information sources serve as the basis for creating new ontologies, combining them, etc. The information retrieval engine is based on the categorization of the data.
Ontology is also used for query expansion. In U.S. Pat. No. 5,822,731 (“the '731 patent”), the contents of which is incorporated herein by reference in its entirety, a semantic network is applied to maximize the number of relevant documents identified during a query search by semantically expanding the search in response to the part of speech associated with each query term in the search.
In U.S. patent application Publication No. 2001/0003183 A1, (“the '183 publication”), the contents of which is incorporated herein by reference in its entirety, a method and apparatus for knowledgebase searching is described. Ontologies are an integral part of this system. A library of query templates and a dictionary that relates keywords to more abstract concepts are first prepared on a computer system. Each template contains one or more typed variables. A query is then generated by entering into the system one or more keywords. Each keyword is abstracted to a concept (using different thesauri and ontologies). Each concept may be further refined by additional abstraction, or by picking one concept from several candidates, or by successive abstraction and rejection of different keywords until an acceptable concept is found. Next, for the concepts that are obtained, the system finds all matching query templates, which are then instantiated with those concepts or with the keywords used to form the concepts. The user then selects the most appropriate query from among the instantiated query templates. The system of the '183 publication may be applied in formulating queries to access any set of information sources. The '183 publication system is particularly useful to access distributed, heterogeneous databases which do not have a single standardized vocabulary or structure.
In fact, the latter three above-mentioned methods represent key word search expansion by means of ontology with different variations.
The method and device for supporting information retrieval by using ontology, and storage medium recording information retrieval support program, described in JP-2000222436, (“the '436 publication”), the contents of which is incorporated herein by reference in its entirety, is designed to provide an information retrieval supporting method. The method is capable of dynamically preparing a database selection menu for selection of a database suited for retrieving information required by a user. The solution suggested by the author of the '436 publication is an ontology that describes the concept system of the information managed by a database as a tree structure of information concepts, from a higher degree of abstraction to a lower degree of abstraction. A database selection menu for specifying the concept of the information to be retrieved by a user is dynamically generated by presenting the concepts registered in the ontology stepwise, from the higher degree of abstraction to the lower degree of abstraction.
Briefly, the method of the '436 publication suggests using ontology, reflecting database content, to help a user to specify the concept that is searched by refinement or generalization of the concept.
The method and system for query reformulation for searching of information is described in U.S. patent application Publication No. 20020147578 A1 (“the '578 publication”), the contents of which is incorporated herein by reference in its entirety. The method provides for reformulating the query by eliminating one or more non-interesting terms using semantic and syntactic information for one or more of the terms, and for querying a database of information based upon the reformulated query. Numerous interrelated dictionaries, thesauri and ontologies are used in the course of processing each question.
Hence, ontology in the system of the '578 publication is part of a system that reformulates a query by eliminating non-informative terms.
Ontology is also applied in information retrieval systems to rank query feedback terms, as is described in U.S. Pat. No. 6,363,378 B1 (“the '378 patent”), the contents of which is incorporated herein by reference in its entirety. The information retrieval system processes the queries, identifies topics related to the query as well as query feedback terms, and then links both the topics and feedback terms to nodes of the knowledge base with corresponding terminological concepts. At least one focal node is selected from the knowledge base based on the topics to determine a conceptual proximity between the focal node and the query feedback nodes. Hierarchical relations from ontology are used to calculate semantic proximity between focal categories and query feedback terms. The query feedback terms are ranked based on conceptual proximity to the focal node.
Thus, in the '378 patent's information retrieval system, ontology is used for topic identification in the knowledge base and in the query and then for calculating the semantic proximity between query feedback terms and the node chosen from the database on the basis of determined topic.
Therefore, the idea of using ontology to improve information retrieval system performance is not new and it is disclosed in different manners in various patents. For example, some of the different manners disclosed include searching in structured and unstructured databases, document theme or topic identification, normalization of semantic representation of the string, search and integration of different types of data, query expansion, etc. As far as ontology use in query expansion is concerned, ontology is generally applied to expand keyword-based and concept-based searches, predominantly using hierarchical relations from the ontology within a certain knowledge domain.
An exemplary embodiment of the present invention may include a system, method and/or computer program product that may provide an ability to solve a problem such as, e.g., but not limited to, inventor problems and user problems, based on semantic methods for data/knowledge presentation and processing, implemented in a linguistic processing module. The basic components of this module may be, in an exemplary embodiment, a linguistic knowledge base (KB), an ontology KB, and/or an expert KB.
The linguistic KB, according to an exemplary embodiment of the present invention, may provide a linguistic analysis of a user's query and its formal semantic representation—verb-parameter-object (VPO), also called “a technical function,” which may be a formal specification of the problem.
The ontology KB may contain knowledge of the surrounding world, presented as a number of terms (concepts and verbs) from different knowledge domains and the semantic relations binding these terms, for example synonymic, “kind-of” and associative relations.
The linguistic processing module may perform semantic expansion with the help of the ontology KB. The linguistic processing module may provide for a maximum recall and precision of information retrieval when searching for solutions to a given problem or its analogs, which may be very important when dealing with the mentioned class of tasks. In addition, a user may have the possibility of varying the degree of the semantic expansion based on the proximity of terms in the ontology KB.
Expert KB, in an exemplary embodiment, may be a knowledge database of solutions for technical problems, obtained from numerous text documents, mainly from patents and articles. These solutions may be presented in SVPO format, where S may be a subject, or solver or performer of a technical function defined by VPO. Comparing a semantically expanded query and solutions from the expert KB, the linguistic processing module may locate the solutions (including analogous solutions) for the given query. The output of the linguistic processing module, in an exemplary embodiment, may be a semantically sorted list of these solutions. As a result, the user may be presented with a list of, e.g., precise, particular, general and analogous solutions for the query.
According to an exemplary embodiment of the present invention, the linguistic processing module may provide an effective solution for a user's query by implementing linguistic, ontology and expert knowledge bases (KBs) and a set of tools to edit and otherwise enrich them together with semantic methods (based thereupon) for information/knowledge processing.
Ontology use may significantly improve the performance of information retrieval systems, which deal with documents that are the main information-carrying medium:
Therefore, exemplary distinctive characteristics of our approach may include the following:
1. A new method for solving problems such as, e.g., but not limited to, inventor problems and user problems, based on the linguistic processing of text documents (mainly for patents) may be provided;
2. By virtue of point 1, the linguistic processing module may provide:
In this way, the approach of the exemplary embodiment of the present invention, may actually provide effective support of professional activities of inventors and may facilitate the solving of problems of a typical user.
Further features and advantages of the invention, as well as the structure and operation of various exemplary embodiments of the invention, are described in detail below with reference to the accompanying drawings.
The foregoing and other features and advantages of the invention will be apparent from the following, more particular description of exemplary embodiments of the invention, as illustrated in the accompanying drawings. In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements. The drawing in which an element first appears is indicated by the leftmost digits in the corresponding reference number. A preferred exemplary embodiment is discussed below in the detailed description of the following drawings:
An exemplary embodiment of the present invention may provide a way of solving problems. In an exemplary embodiment, a linguistic processing module (LPM), with help of a multi-component knowledge base (KB) for natural language and relations between entities of a certain area of interest, may provide high-quality understanding of structured and non-structured user queries and a search technique for finding a most exact and complete list of related solutions.
The Linguistic KB 132 may contain, e.g., but not limited to, rules for parsing, a lemmatization dictionary, and linguistic algorithms for leftmost word trimming and for noun-phrase canonization, in an exemplary embodiment.
The Ontology KB 136 may be a hierarchical database of terms for different knowledge domains. The “term” here is used to denote a concept (term-concept) and a verb (term-verb). Before the structure and contents of Ontology KB are described, it is necessary to give the following definitions:
Synonymy—the semantic relation that holds between two words or two grammatical structures that can (in a given context) express the same meaning.
For example: “alter”, “change”, “modify”, “vary”
Direct synonyms—words or grammatical structures that have the same (or close) meaning regardless of the context.
For example: “water”, “aqua”
Syntactic synonyms—grammatical structures of different types expressing the same (or close) meaning.
For example: “to heat”, “to increase temperature”
“Kind-of” relation (hypernymy/hyponymy)—semantic relation that holds between two words or two grammatical structures where one of them represents a class of objects and the other is a particular representative of the given class.
For example: “oxygen”→“gas”, “increase”→“change”, “temperature”→“parameter”
Association—semantic relation between two words or two grammatical structures with meanings that are in coordinate relations to each other. They are called “sisters/brothers”, have the same “parent” hypernym and are referred to as “children” of this hypernym.
Verb-Parameter-Object (VPO)—formal specification of the problem; Verb here denotes a technical function to be improved; Parameter (may be absent, in which case we refer to VO only) denotes a specific characteristic of the technical system or an element of a technical system to be improved; Object denotes a technical system or an element of a technical system involved in the technical function or process.
Problem: How to increase temperature of water?
VPO: V (increase) P (temperature) O (water)
Subject (S)—is a “solver” of a technical problem defined in VPO structure.
Fire increases temperature of water
S (fire) V (increase) P (temperature) O (water)
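The VPO and SVPO structures defined above can be sketched as simple Python records. The class and field names are illustrative assumptions, not part of the described system; the printed notation follows the examples in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VPO:
    """Formal specification of a problem: Verb, optional Parameter, Object."""
    verb: str
    obj: str
    parameter: Optional[str] = None  # absent for the VO-only form

    def __str__(self) -> str:
        parts = [f"V ({self.verb})"]
        if self.parameter is not None:
            parts.append(f"P ({self.parameter})")
        parts.append(f"O ({self.obj})")
        return " ".join(parts)


@dataclass(frozen=True)
class SVPO:
    """A solution: a Subject that performs the function given by a VPO."""
    subject: str
    function: VPO

    def __str__(self) -> str:
        return f"S ({self.subject}) {self.function}"


problem = VPO(verb="increase", parameter="temperature", obj="water")
solution = SVPO(subject="fire", function=problem)
print(problem)   # V (increase) P (temperature) O (water)
print(solution)  # S (fire) V (increase) P (temperature) O (water)
```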
Main-word—a particular word within a noun-phrase, which defines grammatical properties of the entire noun-phrase.
Noun-phrase: cold water
Main-word: water
Lemmatization—the process of producing the initial form of a word from its word-forms. The initial form for verbs is infinitive form of the verb, for nouns—noun in Nominative case, singular number.
verb: “moving”→“(to) move”
Synset—a set of synonymic concepts (either nouns or verbs).
Synset: “marine vessel”, “vessel”, “watercraft”
Synonym expansion—a function that takes a word (or more complex grammatical element, such as VPO) and returns a set of grammatical elements which express the same meaning.
“vessel”—is expanded into—“marine vessel”, “watercraft”
“(to) heat”—is expanded into—“increase temperature”
“Kind-of” expansion—a function that takes a word (or more complex grammatical element, such as VPO) and returns a set of grammatical elements which express more general or more specific meaning.
“marine vessel”—is expanded into—“craft” (general meaning)
“marine vessel”—is expanded into—“ice yacht”, “patrol boat”, “scooter”, etc. (specific meanings)
“Association” expansion—a function that takes a word (or more complex grammatical element, such as VPO) and returns a set of grammatical elements which express closely associated meanings.
“regularization”—is associated with—“regulation”, “quality control”, “restraint”, etc.
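The three expansion functions defined above may be sketched over a toy ontology as follows. This is a minimal illustration under stated assumptions: the `Ontology` class, its method names and the parent node "control" are hypothetical, and the hierarchy holds only the example terms given in the definitions.

```python
from typing import Dict, FrozenSet, Optional, Set


class Ontology:
    """Toy hierarchy: terms grouped into synsets, each synset with one parent."""

    def __init__(self) -> None:
        self.synset_of: Dict[str, FrozenSet[str]] = {}
        self.parent_of: Dict[FrozenSet[str], FrozenSet[str]] = {}

    def add_synset(self, *terms: str, parent: Optional[str] = None) -> None:
        synset = frozenset(terms)
        for term in terms:
            self.synset_of[term] = synset
        if parent is not None:  # the parent term must already be registered
            self.parent_of[synset] = self.synset_of[parent]

    def synonyms(self, term: str) -> Set[str]:
        return set(self.synset_of[term]) - {term}

    def hypernyms(self, term: str) -> Set[str]:
        """'Kind-of' expansion towards the more general meaning."""
        return set(self.parent_of.get(self.synset_of[term], frozenset()))

    def hyponyms(self, term: str) -> Set[str]:
        """'Kind-of' expansion towards more specific meanings."""
        own = self.synset_of[term]
        return {t for child, parent in self.parent_of.items()
                if parent == own for t in child}

    def associations(self, term: str) -> Set[str]:
        """Terms sharing the same parent hypernym ('sisters/brothers')."""
        own = self.synset_of[term]
        parent = self.parent_of.get(own)
        if parent is None:
            return set()
        return {t for sib, p in self.parent_of.items()
                if p == parent and sib != own for t in sib}


kb = Ontology()
kb.add_synset("craft")
kb.add_synset("marine vessel", "vessel", "watercraft", parent="craft")
for boat in ("ice yacht", "patrol boat", "scooter"):
    kb.add_synset(boat, parent="marine vessel")
kb.add_synset("control")  # hypothetical parent, not named in the text
for term in ("regularization", "regulation", "quality control", "restraint"):
    kb.add_synset(term, parent="control")

print(kb.synonyms("vessel"))
print(kb.hyponyms("marine vessel"))
print(kb.associations("regularization"))
```

Storing one parent per synset keeps the association lookup a matter of comparing parents; an ontology with several parents per term would need a set of parents instead.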
The terms in Ontology KB are grouped on the basis of the following relations:
1) “synonymy”—relation, including:
2) “Kind-of”—relation (hypernym→hyponym)
Besides, relations (1a), (2) and (3) are characteristic of term-concepts and (1a), (1b) and (2) are characteristic of term-verbs. Relation of type (1b) refers to synonyms in form of:
In order to enrich Ontology KB 136, a special computer-based toolset according to the present invention was developed to automate the work of domain knowledge experts (or lexicographers).
Expert KB 140 (
A number of requirements must be met to correctly produce the VPO fields for the search:
“nanotube arrays”→“nanotube array”
“query of user”→“user query”
“polymers and copolymers”→“polymer”, “copolymer”
“bowl containing water”→Object: “bowl”
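The four canonization examples above can be reproduced with a deliberately naive Python sketch. The rules below are inferred from the examples only and are assumptions: the function names, the list of participles and the suffix-stripping singularizer are hypothetical stand-ins for the parsing rules and lemmatization dictionary of the Linguistic KB.

```python
import re
from typing import List

# Hypothetical list of participles that introduce a trailing modifier.
_MODIFIERS = r"\s+(?:containing|comprising|having)\s+"


def singularize(word: str) -> str:
    """Naive lemmatization of a noun: strip a plural 's' (assumption)."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def canonize(phrase: str) -> List[str]:
    """Return the canonical noun phrase(s) for one raw field value."""
    results = []
    for part in re.split(r"\s+and\s+", phrase.lower().strip()):
        # "bowl containing water" -> keep only the part before the modifier
        part = re.split(_MODIFIERS, part)[0]
        # "query of user" -> "user query"
        match = re.fullmatch(r"(.+?)\s+of\s+(.+)", part)
        if match:
            part = f"{match.group(2)} {match.group(1)}"
        words = part.split()
        if not words:
            continue
        words[-1] = singularize(words[-1])  # lemmatize the last word only
        results.append(" ".join(words))
    return results


for raw in ("nanotube arrays", "query of user",
            "polymers and copolymers", "bowl containing water"):
    print(raw, "->", canonize(raw))
```

The sketch lemmatizes only the last word of each phrase, on the assumption that it is the main-word; a dictionary-based lemmatizer would be needed for irregular forms.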
An example of a technical solution is shown below:
Accelerometer detects acceleration of magnetic head
O: magnetic head
The Linguistic Processing Module (
Examples of non-structured user queries and the results of their linguistic processing are given below.
Query: How to test fatigued metals?
Structured form: V (test) O (fatigued metal)
Query: How to measure mechanical properties of MEMS material?
Structured form: V (measure) P (mechanical property) O (MEMS material)
It should be noted that, together with Linguistic KB (Unit 132), the LPM may use Ontology KB 136 during processing, which may provide terms for noun and verb phrases and, eventually, may increase the performance of processing. A parsed user's query may be a formalized VPO structure. These fields must meet the same canonization requirements as those specified for Expert KB 140.
User Query 104 in VPO format may be further submitted to Query Expansion module (Unit 116), which may make use of the hierarchical structure of Ontology KB 136 to perform semantic term expansion. This procedure may be needed later, in order to retrieve as many problem-related solutions as possible, using Expert KB 140.
synonym expansion 372 (performed for verbs, parameters and objects);
“kind-of” expansion 376 (hypernymic-hyponymic expansion, performed only for objects); and/or
associative expansion 380 (performed only for objects).
The synonym expansion (Unit 372) may be a case in which each field of the user query (in VPO format) may be substituted with its corresponding synonyms, both direct and syntactic.
Input (user's query): change dimensions of a solid body
VPO format: V (change) P (dimension) O (body)
Output (synonym expansion): V (change, alter, modify, vary)
It should be noted that in case of syntactical synonyms (V→VP or VP→V), the resulting terms may also be expanded to obtain synonymic terms.
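Synonym expansion of the Verb and Parameter fields, including the V→VP and VP→V syntactic case noted above, might be sketched as follows. The two lookup tables and the function names are hypothetical; the synonym "raise" for "increase" is an assumed entry added only to show that terms produced by a syntactic synonym are expanded again.

```python
from itertools import product
from typing import Dict, Optional, Set, Tuple

Pair = Tuple[str, Optional[str]]  # (verb, parameter or None)

# Direct synonyms per term (the entry for "increase" is an assumption).
DIRECT: Dict[str, Set[str]] = {
    "change": {"alter", "modify", "vary"},
    "increase": {"raise"},
}

# Syntactic synonyms between a bare verb and a verb-parameter pair.
SYNTACTIC: Dict[Pair, Pair] = {
    ("heat", None): ("increase", "temperature"),
    ("increase", "temperature"): ("heat", None),
}


def with_synonyms(term: Optional[str]) -> Set[Optional[str]]:
    """The term itself plus its direct synonyms; an absent field stays absent."""
    if term is None:
        return {None}
    return {term} | DIRECT.get(term, set())


def expand_verb_parameter(verb: str, parameter: Optional[str] = None) -> Set[Pair]:
    """All (verb, parameter) pairs reachable through synonym expansion."""
    pending = [(verb, parameter)]
    seen: Set[Pair] = set()
    while pending:
        v, p = pending.pop()
        for pair in product(with_synonyms(v), with_synonyms(p)):
            if pair in seen:
                continue
            seen.add(pair)
            if pair in SYNTACTIC:  # follow V -> VP or VP -> V and expand again
                pending.append(SYNTACTIC[pair])
    return seen


print(expand_verb_parameter("change", "dimension"))
print(expand_verb_parameter("heat"))
```

The `seen` set guarantees termination even though the syntactic table maps in both directions.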
“Kind-of” expansion (Unit 376) is a case when each field may be substituted with the terms being in hierarchic relations (hypernym vs. hyponym) to the items specified in the query. “Kind-of” expansion may depend on the term being specified in the query and on the semantic possibilities of Ontology KB 136 to expand this term. There may be two types of “kind-of” expansions:
from a particular term to a general one (bottom-to-top);
Input (user's query): change the surface curvature of the conducting liquid drop
VPO format: V (change) P (surface curvature) O (conducting liquid drop)
Output (hypernymic expansion, performed only for Object): V (change)
from a general term to a particular one (top-to-bottom);
Input (user's query): change the direction of movement of the gas flow
VPO format: V (change) P (direction) O (movement)
Output (hyponymic expansion, performed only for Object): V (change)
“Kind-of” relation 376 may give the possibility to examine more particular, more general and associated solutions.
Associative expansion (Unit 380) may be a case when each field may be substituted with the terms in associative relations to the items specified in the query.
Input (user's query): measure traveling distance
VPO format: V (measure) O (traveling distance)
Output (associative expansion, performed only for Object): V (measure)
Expansion of the associative kind 380 may allow one to examine the solutions for similar problems (analogous solutions). Thus an expanded user query in VPO format 384 may result as shown.
The goal of the Searching for Solutions module (Unit 120) is to look for solutions in Expert KB (Unit 140) according to the expansion obtained from the Query Expansion module (Unit 116) and, consequently, to present a list of solutions 128. The search engine may compare the VPO fields from Expert KB 140 with the expansions 372, 376, 384 obtained from the Query Expansion module 116, 300. The correspondence of these fields may result in retrieval of pertinent solutions according to the user's query 104.
The solutions in Expert KB 140 that coincide with the expansions in terms of concepts and verbs to a certain degree may be extracted and put into the list of solutions 128 to be presented as a result to the user. Due to the nature of those solutions, they may need to be semantically sorted (in accordance with the types of expansions 372, 376, 380). The Sorting of Solutions module (Unit 124) may sort all records from Expert KB 140 that were marked for output in the following order:
User's query: V (neutralize) O (hydrochloric acid)
Solution: S (alkali) V (neutralize) O (nitric acid)
In the examples above S stands for “a subject” or a Solver of the problem. The algorithm for sorting the solutions according to their type can be presented in the following two tables (for VPO and VO formats, respectively). A number of symbols used in the table should be described beforehand:
S—original term or its synonym, H—hyponyms, R—hypernyms, C—associated terms, Exact—exact matching of terms, Partial—partial matching (according to Leftmost Word Trimming algorithm), Any—Exact or Partial match.
Table 1 (VPO format), “Extra condition” column:
O ∈ H-Exact
O ∈ R-Exact
P ∈ S-Any & O ∈ SHR-Partial
O ∈ C-Any
Table 2 (VO format), “Extra condition” column:
O ∈ H-Exact
O ∈ R-Exact
O ∈ SHR-Partial
O ∈ C-Any
Some comments on these tables are in order. Take, for example, the “Analogous” row of Table 1: the “Verb” column reads “S-Exact”, which means that the Verb field can contain a synonym (S) of the incoming, non-trimmed (Exact) Verb. The same applies to the Parameter field, except that synonyms can be found for either a trimmed or a non-trimmed Parameter. The Object field may contain any term obtained via semantic expansion (SHRC-Any). Finally, the “Extra condition” column says that the Object (trimmed or not, that is, “Any”) must contain an associated term (C).
As can be seen, there are two (2) “General” rows in the table. General (1) refers to solutions obtained only from semantic expansions of the non-trimmed original term. General (2) refers to solutions obtained using the Leftmost Word Trimming algorithm.
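The extra conditions on the Object field (VO format) may be sketched as a classification step followed by a sort. This is an illustration under several assumptions: the presentation order in `ORDER` is assumed; only “General (1)”, “General (2)” and “Analogous” are row names taken from the text, while “Exact” and “Particular” are hypothetical labels for the synonym and hyponym cases; and the helper names and sample terms are hypothetical.

```python
# Assumed presentation order of solution types (see the lead-in).
ORDER = ["Exact", "Particular", "General (1)", "General (2)", "Analogous"]


def classify_object(obj, exact, partial):
    """Classify a solution's Object against the expansions of the query Object.

    `exact` and `partial` map the letters S, H, R, C to the sets of terms
    obtained from the non-trimmed and the trimmed original term, respectively.
    """
    if obj in exact["S"]:
        return "Exact"
    if obj in exact["H"]:                      # O in H-Exact
        return "Particular"
    if obj in exact["R"]:                      # O in R-Exact
        return "General (1)"
    if any(obj in partial[k] for k in "SHR"):  # O in SHR-Partial
        return "General (2)"
    if obj in exact["C"] or obj in partial["C"]:  # O in C-Any
        return "Analogous"
    return None


def sort_solutions(objects, exact, partial):
    """Drop unmatched Objects and order the rest by solution type."""
    labelled = [(obj, classify_object(obj, exact, partial)) for obj in objects]
    kept = [(obj, kind) for obj, kind in labelled if kind is not None]
    return sorted(kept, key=lambda pair: ORDER.index(pair[1]))


# Hypothetical expansions of the query Object "hydrochloric acid".
exact = {"S": {"hydrochloric acid"}, "H": set(), "R": {"inorganic acid"}, "C": {"nitric acid"}}
partial = {"S": {"acid"}, "H": set(), "R": set(), "C": set()}

ranked = sort_solutions(
    ["nitric acid", "acid", "hydrochloric acid", "inorganic acid"], exact, partial
)
```

Because Python's sort is stable, solutions of the same type keep the order in which they were retrieved.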
The idea of the Leftmost Word Trimming algorithm is as follows. If an exact match for the incoming term is not found in the Ontology KB, the leftmost word is deleted and the remaining part of the term is sought again in the Ontology KB. This process is repeated until a match is found or only one word of the original term remains. In either case, the trimmed versions of the original term are regarded as having a more general meaning than the complete, non-trimmed original term. For example: “photosensitive resin composition” is trimmed to “resin composition”, which is trimmed to “composition”.
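The trimming loop described above may be sketched as follows. The function name, its return convention and the use of a plain set of strings in place of the Ontology KB are assumptions made for illustration.

```python
def leftmost_word_trimming(term, ontology_terms):
    """Drop leftmost words until the remainder is found in `ontology_terms`
    or only one word is left.

    Returns (candidate, found): the last candidate tried and whether it
    was present in the ontology.
    """
    words = term.split()
    while True:
        candidate = " ".join(words)
        found = candidate in ontology_terms
        if found or len(words) <= 1:
            return candidate, found
        words = words[1:]  # delete the leftmost word and try again


# Hypothetical ontology content: only the most general term is known.
result = leftmost_word_trimming("photosensitive resin composition", {"composition"})
```

With the hypothetical ontology above, the term is trimmed twice before a match is found; a caller may treat any candidate shorter than the original term as a Partial match.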
A computer and/or communications system may be used for several components of the system in an exemplary embodiment of the present invention. An exemplary embodiment may include a computer as may be used for several computing devices such as, e.g., but not limited to, the knowledge bases of the exemplary embodiments of the present invention. The computer may include, but is not limited to: e.g., any computer device, or communications device including, e.g., a personal computer (PC), a workstation, a mobile device, a phone, a handheld PC, a personal digital assistant (PDA), a thin client, a fat client, a network appliance, an Internet browser, a paging, or alert device, a television, an interactive television, a receiver, a tuner, a high definition (HD) television, an HD receiver, a video-on-demand (VOD) system, a server, or other device.
The computer, in an exemplary embodiment, may include a central processing unit (CPU) or processor, which may be coupled to a bus. Processor may, e.g., access main memory via bus. The computer may be coupled to an Input/Output (I/O) subsystem such as, e.g., a network interface card (NIC), or a modem for access to a network. Computer may also be coupled to a secondary memory directly via bus, or via main memory, for example. Secondary memory may include, e.g., a disk storage unit or other storage medium. Exemplary disk storage units may include, but are not limited to, a magnetic storage device such as, e.g., a hard disk, an optical storage device such as, e.g., a write once read many (WORM) drive, or a compact disc (CD), or a magneto optical device. Another type of secondary memory may include a removable disk storage device, which may be used in conjunction with a removable storage medium, such as, e.g. a CD-ROM, or a floppy diskette. In general, the disk storage unit may store an application program for operating the computer system referred to commonly as an operating system. The disk storage unit may also store documents of a database (not shown). The computer may interact with the I/O subsystems and disk storage unit via bus. The bus may also be coupled to a display for output, and input devices such as, but not limited to, a keyboard and a mouse or other pointing/selection device.
In this document, the terms “computer program medium” and “computer readable medium” may be used to generally refer to media such as, e.g., but not limited to, a removable storage drive, a hard disk installed in a hard disk drive, and signals, etc. These computer program products may provide software to the computer system. The invention may be directed to such computer program products.
References to “one embodiment,” “an embodiment,” “example embodiment,” “exemplary embodiments,” “various embodiments,” etc., may indicate that the embodiment(s) of the invention so described may include a particular feature, structure, or characteristic, but not every embodiment necessarily includes the particular feature, structure, or characteristic. Further, repeated use of the phrase “in one embodiment,” or “in an exemplary embodiment,” does not necessarily refer to the same embodiment, although it may.
In the following description and claims, the terms “coupled” and “connected,” along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. Rather, in particular embodiments, “connected” may be used to indicate that two or more elements are in direct physical or electrical contact with each other. “Coupled” may mean that two or more elements are in direct physical or electrical contact. However, “coupled” may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other.
An algorithm is here, and generally, considered to be a self-consistent sequence of acts or operations leading to a desired result. These include physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers or the like. It should be understood, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities.
Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining,” or the like, refer to the action and/or processes of a computer or computing system, or similar electronic computing device, that manipulate and/or transform data represented as physical, such as electronic, quantities within the computing system's registers and/or memories into other data similarly represented as physical quantities within the computing system's memories, registers or other such information storage, transmission or display devices.
In a similar manner, the term “processor” may refer to any device or portion of a device that processes electronic data from registers and/or memory to transform that electronic data into other electronic data that may be stored in registers and/or memory. A “computing platform” may comprise one or more processors.
Embodiments of the present invention may include apparatuses for performing the operations herein. An apparatus may be specially constructed for the desired purposes, or it may comprise a general purpose device selectively activated or reconfigured by a program stored in the device.
While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents. While this invention has been particularly described and illustrated with reference to a preferred embodiment, it will be understood by those having ordinary skill in the art that changes in the above description or illustrations may be made with respect to formal detail without departing from the spirit and scope of the invention.
|U.S. Classification||706/55, 706/45|
|Apr 17, 2008||AS||Assignment|
Owner name: IWINT INTERNATIONAL HOLDINGS INC., VIRGIN ISLANDS,
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ZHANG, GUOMING;REEL/FRAME:020814/0546
Effective date: 20080415
|Apr 2, 2012||AS||Assignment|
Owner name: BEIJING IWINT TECHNOLOGY LTD., CHINA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:IWINT INTERNATIONAL HOLDINGS INC.;REEL/FRAME:027971/0954
Effective date: 20120201
|May 16, 2012||AS||Assignment|
Owner name: BEIJING IWINT TECHNOLOGY LTD., CHINA
Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ADDRESS OF THE ASSIGNEE PREVIOUSLY RECORDED ON REEL 027971 FRAME 0954. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:IWINT INTERNATIONAL HOLDINGS INC.;REEL/FRAME:028216/0556
Effective date: 20120201
|Sep 23, 2013||FPAY||Fee payment|
Year of fee payment: 4
|Sep 25, 2017||MAFP|
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2552)
Year of fee payment: 8