WO1998057278A1 - Database query system and method - Google Patents

Database query system and method Download PDF

Info

Publication number
WO1998057278A1
WO1998057278A1 PCT/US1998/011199 US9811199W WO9857278A1 WO 1998057278 A1 WO1998057278 A1 WO 1998057278A1 US 9811199 W US9811199 W US 9811199W WO 9857278 A1 WO9857278 A1 WO 9857278A1
Authority
WO
WIPO (PCT)
Prior art keywords
key values
database
value
pair
key
Prior art date
Application number
PCT/US1998/011199
Other languages
French (fr)
Inventor
Arnold N. Blinn
Michael S. Lorton
Original Assignee
Microsoft Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corporation filed Critical Microsoft Corporation
Priority to AU78065/98A priority Critical patent/AU7806598A/en
Priority to JP50271199A priority patent/JP4425355B2/en
Priority to EP98926166A priority patent/EP1010101A1/en
Publication of WO1998057278A1 publication Critical patent/WO1998057278A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • G06F16/2448Query languages for particular applications; for extensibility, e.g. user defined types
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • G06F16/24578Query processing with adaptation to user needs using ranking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • Y10S707/99932Access augmentation or optimizing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • Y10S707/99933Query processing, i.e. searching
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • Y10S707/99933Query processing, i.e. searching
    • Y10S707/99934Query formulation, input preparation, or translation
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • Y10S707/99933Query processing, i.e. searching
    • Y10S707/99935Query augmenting and refining, e.g. inexact access

Definitions

  • This invention relates to methods of searching and browsing databases using remote clients.
  • a large database application will typically support a feature whereby a user may enter a search key that only approximates the actual value for which the user is searching. Frequently, this is done using a "wildcard” character such as "*".
  • the "*" character represents all possible character strings. Accordingly, a search for "jon*” will retrieve any values beginning with the string “jon” followed by any other string of characters. "Jon”, “Jones”, and “Jonathan” would all be retrieved as a result of using "jon*" as a search key.
  • the advantages of this search method are that it is flexible, easy to explain to users, and easy to implement. However, the method is somewhat limited.
  • this method allows users to unwittingly or maliciously submit overly broad queries — queries so broad that an unacceptably large portion of the database is returned. For example, a user might type a query such as "a*", which would return all values starting with "a”.
  • the prior art includes a database lookup scheme that addresses these problems to some degree.
  • two tables were maintained in addition to the actual database.
  • a first table had a plurality of rows, each corresponding to an individual set of data records. Each row contained fields for three values: a first value, a last value, and an ID.
  • the first and last values defined sets or pages of data, similarly to the index entries at the top of a telephone directory or dictionary.
  • a data page corresponding to a particular table entry would include data values in the range between the first value and the last value, inclusive. This data page was placed in another table that was pre-generated.
  • a corresponding table (referred to as a page definition table) might appear as shown in Table 1 below:
  • an information page having ID 0 would include 10 names, beginning with “Ableson” and ending with “Baxter.” Page 1 would have the next 10 names, beginning with “Beaver” and ending with “Bowman.”
  • the page definition table was created so that each page had an approximately equal number of names.
  • a second table was also maintained. This table had actual data from the database, arranged by ID number. In the example, such a table (referred to as a page data table) would have appeared as follows:
  • the ID value of this table corresponded to the ID value of the page definition table (Table 1).
  • the PAGEDATA values comprised the actual data that was to be returned to a user for particular page, taken from the database.
  • the data consists merely of the names themselves. In actual applications, other data might also have been included, such as telephone numbers or addresses.
  • To perform a search under this scheme a user would enter the beginning portion of a name, or some string that the user believed approximated the name. Suppose, for example, that the user entered "fie" as a search string. This string would be used to query Table 1, to find the ID for a row spanning the search string.
  • the search might appear as follows: select ID from page_defini tion_table where ⁇
  • a user submits a search value to a server or service provider.
  • the search value is the user's best guess regarding the data that is sought. Wildcards are not required to be entered by users.
  • the server maintains a database in a conventional format.
  • the database has a plurality of records, each including a plurality of fields. At least one field of each record is a key field, used to search the records of the database. Standard queries can be performed against the database using specified key values.
  • the server also maintains a plurality of page definitions in a page table.
  • Each page definition defines a page of database records in terms of lower and upper key values.
  • each page definition indicates a single pair of lower and upper key values.
  • the server examines the page definitions to find one having key values that surround the search value. It then uses these key values to search the database, and to find all records whose key values fall between the key values of the page definition.
  • the upper and lower key values of the page definitions do not specify the first and last key values of the database records that are to be returned to the user. Rather, the lower key value specifies the key value of the first database record to be returned, while the upper key value specifies the key value of the first database record of a subsequent page of records.
  • each page definition has two pairs of key values: a first pair that is used to identify the page definition corresponding to a search value, and a second pair that is used to query the database once the correct page definition has been found.
  • the information returned to the user does not need to correspond in range to the index scheme used to identify page definitions.
  • the first pair defines a range of key values that is a subset of the range defined by the second pair.
  • Fig. 1 is a block diagram of an information retrieval system in accordance with the invention.
  • Fig. 2 is a block diagram showing pertinent components of a client or server computer in accordance with the invention.
  • Fig. 3 is a flowchart showing preferred steps in accordance with the invention.
  • Fig. 1 shows an information retrieval computer system in accordance with one embodiment of the invention, generally designated by reference numeral 10.
  • Information retrieval system 10 includes a server 12 and a plurality of clients 14. Communications between the clients and the server is provided through a " data communications network such as a local-area network, a wide-area network, or a public network such as the Internet.
  • a data communications network such as a local-area network, a wide-area network, or a public network such as the Internet.
  • Server 12 has or has access to one or more data storage media such as hard disks or hard disk arrays.
  • the server maintains a database 16 on one of these hard disks.
  • the database is formatted and arranged in a conventional format, having a plurality of database records. Each record has a plurality of fields containing variable data. The records can be searched and or sorted based on values contained in one or more key fields. Values in key fields are referred to herein as key values.
  • the server has a database management system (DBMS) 18 for manipulating the database. Queries can be submitted to the DBMS in SQL (standard query language) format or a similar language that allows access to a relational database.
  • SQL standard query language
  • Fig. 2 shows pertinent hardware components of an example client computer generally referenced by numeral 40.
  • Computer 40 can be any one of a variety of different types of devices including consumer appliances such as set-top boxes, handheld or pocket computers, Internet terminals, desktop computers, and laptop or notebook computers.
  • a server computer is configured similarly to the configuration of Fig. 2, although it is optimized for server functions such as file access, application sharing, and database access.
  • Computer 40 includes a microprocessor 41 and a computer-readable storage medium such as electronic memory 42.
  • Processor 40 is a conventional, off-the-shelf microprocessor such as an x86 series microprocessor, manufactured by Intel Corporation of Santa Clara, California.
  • Computer system 40 also includes other typical computer hardware, such as I/O component 38 including interface electronics, as well as multimedia rendering devices 39 such as audio and video components.
  • Physical memory 42 preferably comprises randomly accessible read/write electronic memory (RAM), as well as secondary storage such as a magnetic hard disk or other data recording media.
  • Memory 42 might also include other types of computer-readable storage media on which executable components are distributed and stored, including floppy disks, CD-ROMs, and non-volatile electronic memory such as EPROMs.
  • Database 16 can be stored in memory 42, or in other disk-based memory that is accessible only through network resources.
  • computer system 40 includes an operating system 43 and one or more application programs 44 that execute in conjunction with the operating system.
  • the Windows 95® operating system available from Microsoft Corporation of Redmond, Washington, is an example of an operating system suited for use in a client computer.
  • Microsoft's Windows NT® operating system is an example of an operating system suitable for a server computer.
  • a plurality of application programs typically execute from memory 42.
  • active application programs include an Internet browser such as Microsoft's Internet ExplorerTM.
  • active applications include a database management system such as SQL*Server, Oracle, and Internet Information Server.
  • the invention includes a method of finding data in the server-maintained database 16.
  • one of clients 14 submits a search value to server 12.
  • the search value is of the information type contained in one of the key fields of database records.
  • the server is responsive to the submission of the search value to identify and return data from a limited set of database records which are likely to most closely match the submitted search value.
  • Such a set of database records is referred to as a page of database records, and the data return from " this record page is referred to as a data page.
  • the server To identify a single record page in response to a search value, the server maintains and references a plurality of page definitions. Each page definition indicates at least one pair of key values. When given a particular search value, the server performs a step of identifying one of the page definitions that corresponds to the search value. The server then searches the database to find a page or set of database records whose key values are bounded by the key values indicated by the identified page definition. Data from these records is then returned to the client as a data page. The records are optionally sorted before they are returned to the user. For instance, names might be sorted alphabetically before returning them to the user.
  • the page definitions are preferably stored in a searchable table on a storage media accessible by the server and serviced by the DBMS and referred to herein as a page definition table.
  • a page definition table Such a table is designated by reference numeral 20 in Fig. 1.
  • the page definition table has a plurality of individual entries or rows. Each entry contains a pair of key values: a LOWER key value and an UPPER key value. In this embodiment of the invention, these key values are used as both index key values and as bounding key values. The difference between these types of key values will become apparent in the discussion below relating to a second embodiment of the invention.
  • Table 3 illustrates an example of a page definition table that corresponds to the example database used in the discussion above:
  • Each entry of the page definition table corresponds to a set of database records having key values that are bounded by the LOWER and UPPER key values of the entry. More specifically, a single set of database records corresponding to a particular entry of the page definition table comprises all records having a key values that are (a) equal to or greater than the LOWER key value of the entry and (b) less than the UPPER key value of the entry.
  • the first entry has a LOWER key value of "a” because "a” is the lexicographically first possible key value in this system (this value will differ depending on the type of data). It has an UPPER key value of "Beaver.”
  • a set of database records corresponding to this entry thus includes all names prior to Beaver in an alphabetical list.
  • a set of database records corresponding to the second entry includes all names in the range between "Beaver” and "Bowman” ("Bowman” is the name in the database that precedes "Bratsanos”).
  • the LOWER key value for the first entry is a value that is equal to or less than any possible search value.
  • the UPPER key value for the last entry of the table is a value that is equal to or greater than any possible search value (again, this value will differ depending on the type of data contained in the database).
  • the page definition table is defined based on a desired size and division of information. With alphabetical information, it might be desired to have a single page of information for each letter of the alphabet, which would eliminate the need for the page definition table. However, it will often be desired to have coarser or finer divisions, so that an appropriate or approximately equal amount of data will correspond to each page table entry. This is especially true for large data sets. Good examples of simple page definition tables are the tabs of dictionaries or address books (A-B XYZ etc).
  • this table is different than the table shown above in Table 1.
  • Table 3 does not specify an ID number. Just the LOWER and UPPER key values are specified. In addition, the UPPER key value of each row is the same as the LOWER value for the next row. This prevents certain search values from falling between table entries — any possible and allowed search value will fall within the key values of one of the entries.
  • the server searches the page definition table to identify a row or entry that corresponds to the search value. Specifically, the server finds the entry having LOWER and UPPER key values that surround the search value. Even more specifically, the server finds the single entry for which it is true that (a) the search value is equal to or greater than the LOWER key value of the entry and (b) the search value is strictly less than the UPPER key value of the entry.
  • the server performs a search or query of the database to identify the set of database records whose key values are bounded by the key values of the identified table entry.
  • This set comprises all records having a key values that are (a) equal to or greater than the LOWER key value of the entry and (b) less than the UPPER key value of the entry. Data "from these records is returned as a response to the client's request for data.
  • the database will in most cases have a plurality of fields other than the key fields.
  • the values returned to the client are not necessarily the key values from the identified set of database records. Rather, any group of field values can be returned from the set of database records once the appropriate set of database records is identified. In most cases, however, the group of field values returned for each record will include values from the key field.
  • the server returns the LOWER and UPPER key values that were used to identify the set of database records. This state information is then stored by the client, and used by the client to formulate subsequent queries or requests for finding additional sets or pages of database records that neighbor the set that was initially found.
  • a step 100 comprises submitting a search value from a server to a client.
  • the search value corresponds to a key field in a database maintained by the server.
  • the server performs a step 102 of referencing a plurality of page definitions to identify one of the page definitions that corresponds to the supplied search value.
  • each page definition indicates a pair of key values, wherein each pair includes an UPPER key value and a LOWER key value.
  • the page definitions are preferably stored as entries in a table.
  • step 102 comprises searching for a table entry whose pair of key values surrounds the supplied search value — where the search value falls between a range defined by the LOWER and UPPER key values, inclusive of the LOWER key value.
  • a step 104 comprises searching the database to find a set of database records whose key values are bounded by the key values of the page definition identified in step 102. Specifically, this step is formulated to return all records having key values between the LOWER and UPPER key values of the identified page definition, inclusive of the LOWER key value.
  • a step 106 comprises returning data from the identified set of database records to the client. This step includes returning the key values from the page definition identified in step 102.
  • the client can derive new search values based on the returned key values. Such new search values are submitted to the server in order to browse through different page of database records.
  • the latter of the two preceding query statements can return any data contained in the database records, without requiring regeneration of any tables. There is no need to predefine the returned fields, as is the case in the prior art.
  • the query is formulated to return a full name and address for each record in the selected page of database records.
  • queries can be formulated and submitted by the server, or they can be formulated and submitted directly from a client.
  • the queries are performed by the server, and data is returned in HTML format.
  • LOWER previous_upper
  • the closest matches to the search value can be positioned consistentl near the beginning, near the middle, or near the end of the returned data, although typically it would be desired to display the data near the beginning. For example, once a targeted data page is determined, that data page and the following data page could be returned. For further user convenience, it is also desirable to highlight the closest match to the user-specified key value.
  • the page definitions can be configured to include two pairs of key values: a pair of index key values and a pair of bounding key values.
  • the index key values are used to find the page definition corresponding to the search value, and the bounding key values are used to define the bounds of database records to be returned to the client.
  • This alternative scheme can also be used to position the closest matching records near arbitrary location in the returned data, such as near the beginning, middle, or end.
  • Table 4 illustrates a page definition table in accordance with this alternative embodiment of the invention:
  • Each page definition table entry includes both a pair of index key values and a pair of bounding key values.
  • the index key values are used when identifying which table entry corresponds to a search value. Specifically, the entry whose LOWER index key value is equal to or less than the search value and whose UPPER index key value is greater than the search value is the entry that corresponds to the search value.
  • the index key values are configured exactly as in Table 3: the pairs form an ordered sequence, wherein the UPPER key value of a particular pair is the same as the LOWER key value of a subsequent pair. This prevents search values from falling between pairs of index key values.
  • the bounding key values are used when querying the database for a set of records. Once a table entry is identified, the database is queried for all records whose key values fall between the bounding entries of the identified table entry.
  • the bounding values of a particular table entry preferably span the index entries of the same entry.
  • the index key values are "Bratsanos" and "Cheng.” Search values between these values (inclusive of "Bratsanos") will result in this table entry being selected. However, the data returned will correspond to all database records whose key values are between "Borgman” and "Conway.” Thus, the bounding range is larger than the index range for any particular entry.
  • the bounding key values for a particular entry might be identical to the index key values for the entry. This special case is actually implemented by the previous example of Table 3; where the index key values and the bounding key values are equal to each other, only one pair of values needs to be listed in the page definition table.
  • data that most closely matches the search value can be biased toward one portion of the search results. For example, if the LOWER bounding key value is equal to the LOWER index key value and the UPPER bounding key value is much greater than the UPPER index key value, the closest matches will be biased toward the beginning of the returned data.
  • the client When returning data, either the bounding keys, the index keys, or both are returned to the client.
  • the client can then submit a new search criterion to the server, based on these values, to browse to a different set of database records as described above.
  • returned index key values are used to formulate a new search query.
  • the server can control the position of the searched word in the data returned to the client. If it is desired for the key value (or its closest match) to be the first record returned, earlier records can simply be omitted from the returned search results. In other words, the earlier records can be "truncated" off of the results of the second query before returning them to the client. In order to do this, selecting the "previous" records must return those records that were truncated off. So the first request for a previous set of records would probably not return the previous set, but rather the original untruncated first set.

Abstract

The invention includes an information retrieval system having a server and a plurality of clients. The server has access to one or more data storage media on which the server maintains a database. The database has a plurality of records that can be searched based on key values. The server maintains a page definition table in addition to the database. The page definition table has a plurality of entries, each indicating pairs of key values. To perform an information search, a client submits a search value to the server. The server responds by identifying an entry from the page definition table that corresponds to the search value. The server then identifies a set of database records whose key values are bounded by the pair of key values of the identified entry. Data from these database records are returned as search results to the client.

Description

DATABASE QUERY SYSTEM AND METHOD
TECHNTCAL FTELD
This invention relates to methods of searching and browsing databases using remote clients.
BACKGROUND OF THE INVENTION
A large database application will typically support a feature whereby a user may enter a search key that only approximates the actual value for which the user is searching. Frequently, this is done using a "wildcard" character such as "*". The "*" character represents all possible character strings. Accordingly, a search for "jon*" will retrieve any values beginning with the string "jon" followed by any other string of characters. "Jon", "Jones", and "Jonathan" would all be retrieved as a result of using "jon*" as a search key. The advantages of this search method are that it is flexible, easy to explain to users, and easy to implement. However, the method is somewhat limited. For example, if the values returned from a search do not contain the value sought, there is no obvious way to expand or modify the query to find "nearby" values. In addition, this method allows users to unwittingly or maliciously submit overly broad queries — queries so broad that an unacceptably large portion of the database is returned. For example, a user might type a query such as "a*", which would return all values starting with "a".
Now that databases are increasingly being queried over the Internet, it is becoming even more important to return search results to users in relatively small portions. Some service providers indeed provide search results in measured lengths. For example, only the first 20 hits of a search might be returned, and the user given the option of then requesting the next 20 hits. However, such service providers often initiate a completely new search when the next 20 hits are requested, and then Select the second group of 20 hits from the results. This is because of the impracticality of maintaining state information at the server for the thousands of users that might be accessing a database at any given time. Hence a query that returns many sorted rows would be executed completely for each request to the server. This consumes a significant amount of server resources, for searching, sorting, and transmitting data to the client.
The prior art includes a database lookup scheme that addresses these problems to some degree. In accordance with this prior art scheme, two tables were maintained in addition to the actual database. A first table had a plurality of rows, each corresponding to an individual set of data records. Each row contained fields for three values: a first value, a last value, and an ID. The first and last values defined sets or pages of data, similarly to the index entries at the top of a telephone directory or dictionary. A data page corresponding to a particular table entry would include data values in the range between the first value and the last value, inclusive. This data page was placed in another table that was pre-generated.
Consider, as an example, a database of last names as follows:
Ableson Ahlquist Applegate Ashley Bachmann Bailey Bangsberg Barker Barnes Baxter Beaver Becker Beckett Bergquist Beutel Bezanson Blackwell Boldman Borgman
Bowman Bratsanos Braum Brown Burghardt Butler Cantwell Carlstrom Carolan Caudill Chen Cheng Chung Conway Crawford Cunneely Cutler Davidson Davies Davis Day De Donato DeBragga DeBroeck DeNike Dosch Douglas Doyle Drage Driscoll Dugan Dulfer Duncan Dunkelberger Dunn
Durham Edwards Elder Ellis Evans Ewing Fang Fay Ferguson Ferris Ferry Fine Fitzgibbons Flippin Flowers Fort Francoeur Freeman Funaro Galasyn Garrity Gilroy Goodman Gordon Gower Grady Gui Guo Gwertzman Harbin Harkness Hart Hartill Haygood Heflin Helfrich Hersey Hogan Honey cutt Hornstein
Hunter Huse Jackson Jewiss Jones Joyner Kearney Kelly Kelnhofer King Kinsella Kramer Kreider Labyak Lalonde Laurie Lee Leptich Lim Lindell Livingston Loggins Lorenzana Lowe Ludden Lutz Lyons Maccalman Maes Mah Mahoney Malugen Martin Mastan Maumas Maxey Mc Mahon McCann McCrory McDaniel McGuire Mercure Meyers Miao Milford Miller Moffitt Monberg Moncure Monroe Mooney Moore Morrison Murray Naroski Newton O'Connor O'Sullivan Oker Packham Perabo Peters Phan Pipkins Plamondon Prekeges Prets Quach Raines Ranch Rather Reid Reynolds Rice Richardson Rivera Roberts Robinson Rodrigues Rutledge Saunders Shamblin Simmons Simpson Slonsky Spahn St. Clair Stulz Sturm Sturms Teply Theivagt Thomson Tierney Timoney Titus Troupe Tucker Tutt Twyman Tyner Utzschneider Naile Nisintainer Walker Ward Watson Webb Webb Whitaker White Wilcher Williams Wilson Wolff Yenne Younkin
A corresponding table (referred to as a page definition table) might appear as shown in Table 1 below:
FIRST LAST ID
Ableson Baxter 0
Beaver Bowman 1
Bratsanos Chen 2
Cheng Day 3 De Donato Dugan 4
Dulfer Ewing 5
Fang Fort 6
Francoeur Grady 7
Gui Helfrich 8
Hersey Joyner 9
Kearney Laurie 10
Lee Lutz 11
Lyons Maxey 12 McMahon Miller 13
Moffit Newton 14
O' Conner Prekeges 15
Prets Rivers 16
Roberts Spahn 17 St. Clair Titus 18
Troupe Ward 19 Watson Younkin 20
Table 1
In accordance with this table, an information page having ID 0 (page 0) would include 10 names, beginning with "Ableson" and ending with "Baxter." Page 1 would have the next 10 names, beginning with "Beaver" and ending with "Bowman." The page definition table was created so that each page had an approximately equal number of names. A second table was also maintained. This table had actual data from the database, arranged by ID number. In the example, such a table (referred to as a page data table) would have appeared as follows:
ID PAGEDATA
0 Ableson . . . Baxter
1 Beaver . . . Bowman
2 Bratsanos . . . Chen
3 Cheng . . . Day 4 De Donato . . . Dugan
5 Dulfer . . . Ewing
6 Fang . . . Fort
7 Francoeur . . . Grady 8 Gui . . . Helfrich
9 Hersey . . . Joyner
10 Kearney . . . Laurie
11 Lee . . . Lutz
12 Lyons . . . Maxey 13 McMahon . . . Miller
14 Moffit . . . Newton
15 O' Conner . . . Prekeges
16 Prets . . . Rivers
17 Roberts . . . Spahn 18 St. Clair . . . Titus
19 Troupe . . . Ward
20 Watson . . . Younkin
Table 2
The ID value of this table corresponded to the ID value of the page definition table (Table 1). The PAGEDATA values comprised the actual data that was to be returned to a user for particular page, taken from the database. In the example here, the data consists merely of the names themselves. In actual applications, other data might also have been included, such as telephone numbers or addresses. To perform a search under this scheme, a user would enter the beginning portion of a name, or some string that the user believed approximated the name. Suppose, for example, that the user entered "fie" as a search string. This string would be used to query Table 1, to find the ID for a row spanning the search string. Using SQL format, the search might appear as follows: select ID from page_defini tion_table where ~~
FIRST <= 'fie' and LAST >= 'fie' where "page_definition_table" refers to Table 1.
In this case, 6 would have been the appropriate ID, indicating that page 6 should be returned to the user. To find the appropriate data, a second query would have been performed against Table 2 using the ID number identified in the first search. An SQL query to accomplish this might have appeared as follows: select PAGEDATA from page_data_table where ID == X where where "page_data_table" refers to Table 2 and X is the ID value returned from the first query. The PAGEDATA value would have been returned to the user, along with the ID value. Using the ID value, the user would have been able to browse to previous and subsequent pages by submitting a query against Table 2 using a decremented or incremented ID.
There were several serious problems with this scheme. One problem occurred when a user submitted a search string falling between the first and last values of adjacent rows of the page definition table. A search string of "grant," for instance, would fall between rows 7 and 8 of the page definition table shown in Table 1. This would cause the search to fail, and no data would be returned to the user.
Another problem was that any insertion of new data required regenerating both of the tables. Furthermore, because of the structure of Table 2, PAGEDATA fields had to be defined before generation of the table. Changing the fields returned from a search required regeneration of the entire PAGEDATA table. A related problem was that such generation required very significant amounts of time when dealing with a database of any size. In a sample application, regeneration required over six hours of computing time.
Another problem with the prior art scheme was that the "closest" match to the user-specified search string (which was selected and highlighted for the user) would show up in search results at unpredictable offsets from the beginning of the returned data. To enhance usability, it would be more desirable to locate this data in a more predictable location, such as at the beginning of the search results or at approximately the middle of the search results. The inventors have found an improvement over the prior art methods described above, allowing users to submit queries and to receive small groups of data in a sorted format, where the closest match is approximately centered within the returned results. The invention also allows the user to easily browse from one group of values to adjacent groups. Even with these desirable features, data can be added to the database without requiring the regeneration of any tables. Furthermore, searching is very efficient, and the server is not required to store state information for users.
SUMMARY OF THE INVENTION
In accordance with the invention, a user submits a search value to a server or service provider. The search value is the user's best guess regarding the data that is sought. Wildcards are not required to be entered by users.
The server maintains a database in a conventional format. The database has a plurality of records, each including a plurality of fields. At least one field of each record is a key field, used to search the records of the database. Standard queries can be performed against the database using specified key values.
The server also maintains a plurality of page definitions in a page table. Each page definition defines a page of database records in terms of lower and upper key values. In one embodiment, each page definition indicates a single pair of lower and upper key values. In response to receiving a search value, the server examines the page definitions to find one having key values that surround the search value. It then uses these key values to search the database, and to find all records whose key values fall between the key values of the page definition.
In contrast to the prior art, the upper and lower key values of the page definitions do not specify the first and last key values of the database records that are to be returned to the user. Rather, the lower key value specifies the key value of the first database record to be returned, while the upper key value specifies the key value of the first database record of a subsequent page of records.
In another embodiment, each page definition has two pairs of key values: a first pair that is used to identify the page definition corresponding to a search value, and a second pair that is used to query the database once the correct page definition has been found. Using this embodiment, the information returned to the user does not need to correspond in range to the index scheme used to identify page definitions.
This allows the set returned to the client to be more configurable. Typically, the first pair defines a range of key values that is a subset of the range defined by the second pair.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a block diagram of an information retrieval system in accordance with the invention. Fig. 2 is a block diagram showing pertinent components of a client or server computer in accordance with the invention.
Fig. 3 is a flowchart showing preferred steps in accordance with the invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Fig. 1 shows an information retrieval computer system in accordance with one embodiment of the invention, generally designated by reference numeral 10. Information retrieval system 10 includes a server 12 and a plurality of clients 14. Communications between the clients and the server is provided through a" data communications network such as a local-area network, a wide-area network, or a public network such as the Internet.
Server 12 has or has access to one or more data storage media such as hard disks or hard disk arrays. The server maintains a database 16 on one of these hard disks. The database is formatted and arranged in a conventional format, having a plurality of database records. Each record has a plurality of fields containing variable data. The records can be searched and or sorted based on values contained in one or more key fields. Values in key fields are referred to herein as key values. The server has a database management system (DBMS) 18 for manipulating the database. Queries can be submitted to the DBMS in SQL (standard query language) format or a similar language that allows access to a relational database.
Fig. 2 shows pertinent hardware components of an example client computer generally referenced by numeral 40. Computer 40 can be any one of a variety of different types of devices including consumer appliances such as set-top boxes, handheld or pocket computers, Internet terminals, desktop computers, and laptop or notebook computers. A server computer is configured similarly to the configuration of Fig. 2, although it is optimized for server functions such as file access, application sharing, and database access. Computer 40 includes a microprocessor 41 and a computer-readable storage medium such as electronic memory 42. Processor 40 is a conventional, off-the-shelf microprocessor such as an x86 series microprocessor, manufactured by Intel Corporation of Santa Clara, California. Although Fig. 2 shows only a single processor, the system might include a plurality of processors, used by a plurality of different processes or tasks, each having one or more execution threads. Computer system 40 also includes other typical computer hardware, such as I/O component 38 including interface electronics, as well as multimedia rendering devices 39 such as audio and video components.
Physical memory 42 preferably comprises randomly accessible read/write electronic memory (RAM), as well as secondary storage such as a magnetic hard disk or other data recording media. Memory 42 might also include other types of computer-readable storage media on which executable components are distributed and stored, including floppy disks, CD-ROMs, and non-volatile electronic memory such as EPROMs. Database 16 can be stored in memory 42, or in other disk-based memory that is accessible only through network resources. In accordance with conventional computer systems, computer system 40 includes an operating system 43 and one or more application programs 44 that execute in conjunction with the operating system. The Windows 95® operating system, available from Microsoft Corporation of Redmond, Washington, is an example of an operating system suited for use in a client computer. Microsoft's Windows NT® operating system is an example of an operating system suitable for a server computer.
A plurality of application programs typically execute from memory 42. At a client, active application programs include an Internet browser such as Microsoft's Internet Explorer™. At the server, active applications include a database management system such as SQL*Server, Oracle, and Internet Information Server.
Referring again to Fig. 1, the invention includes a method of finding data in the server-maintained database 16. In general, one of clients 14 submits a search value to server 12. The search value is of the information type contained in one of the key fields of database records. The server is responsive to the submission of the search value to identify and return data from a limited set of database records which are likely to most closely match the submitted search value. Such a set of database records is referred to as a page of database records, and the data return from" this record page is referred to as a data page.
To identify a single record page in response to a search value, the server maintains and references a plurality of page definitions. Each page definition indicates at least one pair of key values. When given a particular search value, the server performs a step of identifying one of the page definitions that corresponds to the search value. The server then searches the database to find a page or set of database records whose key values are bounded by the key values indicated by the identified page definition. Data from these records is then returned to the client as a data page. The records are optionally sorted before they are returned to the user. For instance, names might be sorted alphabetically before returning them to the user.
The page definitions are preferably stored in a searchable table on a storage media accessible by the server and serviced by the DBMS and referred to herein as a page definition table. Such a table is designated by reference numeral 20 in Fig. 1. The page definition table has a plurality of individual entries or rows. Each entry contains a pair of key values: a LOWER key value and an UPPER key value. In this embodiment of the invention, these key values are used as both index key values and as bounding key values. The difference between these types of key values will become apparent in the discussion below relating to a second embodiment of the invention.
Table 3 below illustrates an example of a page definition table that corresponds to the example database used in the discussion above:
LOWER UPPER a Beaver
Beaver Bratsanos
Bratsanos Cheng
Cheng De Donato
De Donato Dulfer
Dulfer Fang Fang Francoeur
Francoeur Gui
Gui Hersey
Hersey Kearney
Kearney Lee
Lee Lyons
Lyons McMahon
McMahon Moffit
Moffit O' Conner
O' Conner Prets
Prets Roberts
Roberts St. Clair
St. Clair Troupe
Troupe Watson
Watson zzzzzzz
Table 3
Each entry of the page definition table corresponds to a set of database records having key values that are bounded by the LOWER and UPPER key values of the entry. More specifically, a single set of database records corresponding to a particular entry of the page definition table comprises all records having a key values that are (a) equal to or greater than the LOWER key value of the entry and (b) less than the UPPER key value of the entry.
In the example Table 3 above, the first entry has a LOWER key value of "a" because "a" is the lexicographically first possible key value in this system (this value will differ depending on the type of data). It has an UPPER key value of "Beaver." A set of database records corresponding to this entry thus includes all names prior to Beaver in an alphabetical list. A set of database records corresponding to the second entry includes all names in the range between "Beaver" and "Bowman" ("Bowman" is the name in the database that precedes "Bratsanos"). Generally, the LOWER key value for the first entry is a value that is equal to or less than any possible search value. The UPPER key value for the last entry of the table is a value that is equal to or greater than any possible search value (again, this value will differ depending on the type of data contained in the database). The page definition table is defined based on a desired size and division of information. With alphabetical information, it might be desired to have a single page of information for each letter of the alphabet, which would eliminate the need for the page definition table. However, it will often be desired to have coarser or finer divisions, so that an appropriate or approximately equal amount of data will correspond to each page table entry. This is especially true for large data sets. Good examples of simple page definition tables are the tabs of dictionaries or address books (A-B XYZ etc).
Note that this table is different than the table shown above in Table 1. Table 3, for example, does not specify an ID number. Just the LOWER and UPPER key values are specified. In addition, the UPPER key value of each row is the same as the LOWER value for the next row. This prevents certain search values from falling between table entries — any possible and allowed search value will fall within the key values of one of the entries. When a client submits a search value as part of a request for data, the server searches the page definition table to identify a row or entry that corresponds to the search value. Specifically, the server finds the entry having LOWER and UPPER key values that surround the search value. Even more specifically, the server finds the single entry for which it is true that (a) the search value is equal to or greater than the LOWER key value of the entry and (b) the search value is strictly less than the UPPER key value of the entry.
Once such an entry is found, its key values are used as bounding values to identify the corresponding set of database records. Specifically, the server performs a search or query of the database to identify the set of database records whose key values are bounded by the key values of the identified table entry. This set comprises all records having a key values that are (a) equal to or greater than the LOWER key value of the entry and (b) less than the UPPER key value of the entry. Data "from these records is returned as a response to the client's request for data.
Although the page definition table lists key values, the database will in most cases have a plurality of fields other than the key fields. The values returned to the client are not necessarily the key values from the identified set of database records. Rather, any group of field values can be returned from the set of database records once the appropriate set of database records is identified. In most cases, however, the group of field values returned for each record will include values from the key field. In addition to returning the desired data, the server returns the LOWER and UPPER key values that were used to identify the set of database records. This state information is then stored by the client, and used by the client to formulate subsequent queries or requests for finding additional sets or pages of database records that neighbor the set that was initially found. This allows browsing among pages of database information based on state information maintained by the client — no server resources are consumed in order to maintain the state information. Additionally, if there are many records that fall into a range (say works that start with Beethoven includes 1000 entries in a music database and sets of 100 are returned), this technique allows the system to perform well while still allowing the user to browse the data.
Note that there is preferably some preprocessing of search values. For example, all values are converted to lowercase and leading spaces are removed. Key words such as "The" or "A" are also removed. Note that the original data might have a "key" value for certain fields. For example, if the original data was "The Beatles", it might contain an additional column "beatles, the" which is used for searching. In this example, the search key data might not be returned, but the original "actual" data would be returned instead. Nalues in the page definition table are processed in the same way. Fig. 3 illustrates preferred steps in accordance with the invention. A step 100 comprises submitting a search value from a server to a client. The search value corresponds to a key field in a database maintained by the server. In response to submission of the search value, the server performs a step 102 of referencing a plurality of page definitions to identify one of the page definitions that corresponds to the supplied search value. As already described, each page definition indicates a pair of key values, wherein each pair includes an UPPER key value and a LOWER key value. The page definitions are preferably stored as entries in a table. When this is the case, step 102 comprises searching for a table entry whose pair of key values surrounds the supplied search value — where the search value falls between a range defined by the LOWER and UPPER key values, inclusive of the LOWER key value.
A step 104 comprises searching the database to find a set of database records whose key values are bounded by the key values of the page definition identified in step 102. Specifically, this step is formulated to return all records having key values between the LOWER and UPPER key values of the identified page definition, inclusive of the LOWER key value.
A step 106 comprises returning data from the identified set of database records to the client. This step includes returning the key values from the page definition identified in step 102. The client can derive new search values based on the returned key values. Such new search values are submitted to the server in order to browse through different page of database records.
This scheme is a great improvement over the prior art schemes discussed above. The most significant advantage is that no table regeneration is needed when adding records to the database. Although adding records to the database might eventually result in record pages having differing numbers of records, this is not a significant concern. In a large database, the relative sizes will remain about equal, subject only to small statistical variations. At some point, it might be desirέd to regenerate or recalculate the page definition table in order to set all record pages to an approximately equal size. However, this is a relatively small task— especially in comparison to the task of regenerating a page data table as was required in the prior art. Additionally, when defining or regenerating the page definition table, a random sampling of database records can be used to define record sets of equal sizes. This further reduces the time required for regeneration. More importantly, a page data table is not required. Rather, reference to the database itself is made using this technique after lower and upper bounds have been identified from the page definition table. Note that searches can be based on different database fields by maintaining multiple page definition tables. This would have required multiple page data tables in the prior art.
Searches using this scheme are quick and efficient, requiring only a simple indexed lookup in the page definition table and a subsequent indexed query in the database. Example queries using "fie" as a search value would appear as follows in SQL format: select LOWER, UPPER from page_defini tion_table where LOWER <= 'FLE' and UPPER > 'FLE' select name, address from database_table where key_value >= 'FANG' and key_value < 'FRANCEOUR' where key_vαlue refers to a database field containing key values for each record, and database jtable refers to an appropriate table in the existing database. Appropriate checks for "boundary" conditions are achieved by checking for the query returning no rows. Note that the latter of the two preceding query statements can return any data contained in the database records, without requiring regeneration of any tables. There is no need to predefine the returned fields, as is the case in the prior art. In this case, the query is formulated to return a full name and address for each record in the selected page of database records.
These queries can be formulated and submitted by the server, or they can be formulated and submitted directly from a client. In the preferred embodiment, the queries are performed by the server, and data is returned in HTML format.
Another important characteristic of the invention is that it allows browsing from one data page to adjacent data pages, using state information that is maintained by the client rather than the server. This is because the key values used to formulate the query of the database are provided to the client. In the example above, for instance, the key values "Fang" and "Francoer" would have been returned. To browse to a previous set or page of data, the client would submit a new search based on one of these values. For example, new UPPER and LOWER values for a previous page of database records are found by executing a query of the form: select LOWER, UPPER from page_defini tion_table where UPPER = previous _lower
UPPER and LOWER key values for a subsequent page of database records are found as follows: select LOWER, UPPER from page_defini txon_table where LOWER = previous_upper Another problem addressed by the invention relates to the convenient presentation of returned data to the user. When the client is searching for a specific record, that record will generally be positioned randomly in the returned and sorted data. It could be at the beginning of the returned records, at the end of the returned records, or in the middle. If desired, however, the page definition table can be split into smaller entries, so that each data page is relatively small. Then, a plurality of data pages can be returned to the client, with the data page corresponding to the search value being positioned arbitrarily within the returned pages. Using this method, the closest matches to the search value can be positioned consistentl near the beginning, near the middle, or near the end of the returned data, although typically it would be desired to display the data near the beginning. For example, once a targeted data page is determined, that data page and the following data page could be returned. For further user convenience, it is also desirable to highlight the closest match to the user-specified key value.
As an enhancement to the scheme already described, the page definitions can be configured to include two pairs of key values: a pair of index key values and a pair of bounding key values. The index key values are used to find the page definition corresponding to the search value, and the bounding key values are used to define the bounds of database records to be returned to the client. In this way, the range of returned records can be extended beyond that which would otherwise be returned using the scheme described above. This alternative scheme can also be used to position the closest matching records near arbitrary location in the returned data, such as near the beginning, middle, or end.
Table 4 below illustrates a page definition table in accordance with this alternative embodiment of the invention:
INDEX INDEX BOUNDING BOUNDING
LOWER UPPER LOWER UPPER a Beaver a Beckett
Beaver Bratsanos Barnes Browns
Bratsanos Cheng Borgman Con way
Cheng De Donato Caudill DeBroeck
De Donato Dulfer Davis Dunkelberger
Dulfer Fang Driscoll Ferguson
Fang Francoeur Evans Funaro
Francoeur Gui Flowers Gwertzman
Gui Hersey Gower Honeycutt
Hersey Kearney Heflin Kelnhofer
Kearney Lee Jones Lim
Lee Lyons Lalonde Maes
Lyons McMahon Ludden McCrory
McMahon Moffit Maumas Moncure
Moffit O' Conner Milford Oker
O' Conner Prets Naroski Raines Prets Roberts Planondon Rodriquez Roberts St. Clair Richardson Sturm
St. Clair Troupe Slonsky Tutt
Troupe Watson Timoney Webb
Watson ZZZZZZZ Walker ZZZZZZZ
Table 5
Each page definition table entry includes both a pair of index key values and a pair of bounding key values. The index key values are used when identifying which table entry corresponds to a search value. Specifically, the entry whose LOWER index key value is equal to or less than the search value and whose UPPER index key value is greater than the search value is the entry that corresponds to the search value.
The index key values are configured exactly as in Table 3: the pairs form an ordered sequence, wherein the UPPER key value of a particular pair is the same as the LOWER key value of a subsequent pair. This prevents search values from falling between pairs of index key values.
The bounding key values are used when querying the database for a set of records. Once a table entry is identified, the database is queried for all records whose key values fall between the bounding entries of the identified table entry.
When using this scheme, the bounding values of a particular table entry preferably span the index entries of the same entry. Consider, for example, the third entry in Table 4. The index key values are "Bratsanos" and "Cheng." Search values between these values (inclusive of "Bratsanos") will result in this table entry being selected. However, the data returned will correspond to all database records whose key values are between "Borgman" and "Conway." Thus, the bounding range is larger than the index range for any particular entry. In some cases, the bounding key values for a particular entry might be identical to the index key values for the entry. This special case is actually implemented by the previous example of Table 3; where the index key values and the bounding key values are equal to each other, only one pair of values needs to be listed in the page definition table.
Using this method data that most closely matches the search value can be biased toward one portion of the search results. For example, if the LOWER bounding key value is equal to the LOWER index key value and the UPPER bounding key value is much greater than the UPPER index key value, the closest matches will be biased toward the beginning of the returned data.
When returning data, either the bounding keys, the index keys, or both are returned to the client. The client can then submit a new search criterion to the server, based on these values, to browse to a different set of database records as described above. Preferably, returned index key values are used to formulate a new search query.
This enhancement is useful for controlling the data set returned to the client in relation to the key value specified by the client. There are also other advantages. First, the server can control the position of the searched word in the data returned to the client. If it is desired for the key value (or its closest match) to be the first record returned, earlier records can simply be omitted from the returned search results. In other words, the earlier records can be "truncated" off of the results of the second query before returning them to the client. In order to do this, selecting the "previous" records must return those records that were truncated off. So the first request for a previous set of records would probably not return the previous set, but rather the original untruncated first set. This can be achieved with a single flag of state stored by the client and passed back into the server. Second, browsing can be enhanced by overlapping the data sets, so there is some context while viewing the next or previous sets. In compliance with the statute, the invention has been described in language more or less specific as to structural and methodical features. It is to be understood, however, that the invention is not limited to the specific features described, since the means herein disclosed comprise preferred forms of putting the invention into effect.
The invention is, therefore, claimed in any of its forms or modifications within the proper scope of the appended claims appropriately interpreted in accordance with the doctrine of equivalents.

Claims

1. A method of finding data in a database based upon a supplied search value, the database having a plurality of records that can be searched based on key values, the method comprising the following steps: referencing a plurality of page definitions to identify a pair of key values that surround the supplied search value; searching the database to find a set of database records whose key values are bounded by the identified pair of key values.
2. A method as recited in claim 1, further comprising a step of storing the page definitions in a searchable table.
3. A method as recited in claim 1, further comprising a step of storing the page definitions in a searchable table having a plurality of entries, each entry including a pair of key values, each such pair including a lower key value and an upper key value corresponding to a particular set of database records, wherein the upper key value of a given entry is the same as the lower key value of a different entry.
4. A method as recited in claim 1, comprising an additional step of finding additional sets of database records that neighbor the set found in the searching step.
5. A method of finding data in a database based upon a supplied search value, the database having a plurality of records that can be searched based on key values, the method comprising the following steps: referencing a plurality of page definitions to identify one of the page definitions that corresponds to the supplied search value, the identified page definition indicating a first pair of key values; searching the database to find a set of database records whose key values are bounded by said first pair of key values.
6. A method as recited in claim 5, wherein: each page definition additionally indicates a second pair of key values; and said one of the page definitions is identified based upon its second pair of key values surrounding the supplied search value.
7. A method as recited in claim 5, wherein: each page definition additionally specifies a second pair of key values; said one of the page definitions is identified based upon its second pair of key values surrounding the supplied search value; and the second pair of key values spans the first pair of key values.
8. A computer-readable storage medium having instructions for finding data in a database based upon a supplied search value, the database having a plurality of records that can be searched based on key values, the instructions being executable to perform the following steps: storing a page definition table that has a plurality of entries, each entry corresponding to a set of database entries, wherein each entry indicates a respective first pair of key values; searching the page definition table to identify an entry that corresponds to the supplied search value; searching the database to find a set of database records whose key values are bounded by the first pair of key values indicated by the identified entry of the page definition table.
9. A computer-readable storage medium as recited in claim 8, wherein the searching step comprises searching for a table entry whose first pair of key values surrounds the supplied search value.
10. A computer-readable storage medium as recited in claim 8, wherein: the searching step comprises searching for a table entry whose first pair of key values surrounds the supplied search value; each respective first pair of key values comprises a lower key value and an upper key value; and the upper key value indicated by a given entry of the page definition table is the same as the lower key value of a different entry of the page definition table.
11. A computer-readable storage medium as recited in claim 8, wherein: each entry of the page definition table additionally indicates a second pair of key values; the searching step comprises searching for a table entry whose second pair of key values surrounds the search value.
12. A computer-readable storage medium as recited in claim 8, wherein: each entry of the page definition table additionally indicates a second pair of key values; the searching step comprises searching for a table entry whose second pair of key values surrounds the search value; each respective second pair of key values comprises a lower key value and an upper key value; the upper key value indicated by a given entry of the page definition table is the same as the lower key value of a different entry of the page definition table.
13. A computer-readable storage medium as recited in claim 8, containing further instructions for performing an additional step comprising finding additional sets of database records that neighbor the set of database records found in the searching step.
14. In a computer system having a server and a plurality of clients, a method of finding data in a server-maintained database having a plurality of records that can be searched based on key values, the method comprising the following steps: maintaining a plurality of page definitions at the server indicating pairs of index key values and pairs of bounding key values; submitting a search value to the server from a client; identifying one of the page definitions whose index key values surround the search value; identifying a set of database records whose key values are bounded by the bounding key values of the identified one of the page definitions; returning data from the identified set of database records to the client.
15. A method as recited in claim 14, comprising an additional step of identifying and returning additional neighboring sets of database records.
16. A method as recited in claim 14, wherein each page definition indicates bounding key values that are identical to the index key values indicated by the page definition.
17. A method as recited in claim 14, wherein each page definition indicates bounding key values that span the index key values indicated by the page definition.
18. A method as recited in claim 14, further comprising a step of returning the index key values to the client along with the data from the identified set of database records.
19. A method as recited in claim 14, further comprising a step of returning the bounding key values to the client along with the data from the identified set of database records.
20. A method as recited in claim 14, comprising the following additional steps: returning the index key values of the identified one of the page definitions to the client along with the data from the identified set of database records; submitting a new search criteria to the server to browse to a different set of database records, wherein the new search criteria is derived from the returned index values.
21. A method as recited in claim 14, comprising the following additional steps: returning the index key values of the identified one of the page definitions to the client along with the data from the identified set of database records; deriving a new search value based on the returned index key values; submitting the new search value to the server to browse to a different set of database records.
22. A method as recited in claim 14, wherein: each pair of index key values comprises a lower key value and an upper key value; the upper key value indicated by a given page definition is the same as the lower key value of a page definition.
23. An information retrieval system comprising: one or more data storage media; a database maintained on said one or more data storage media, the database having a plurality of records that can be searched based on key values; a page definition table maintained on said one or more data storage media, the page definition table having a plurality of entries indicating pairs of index key values and pairs of bounding key values; wherein each pair of bounding keys defines a set of database records having key values that are bounded by the pair of bounding keys.
24. An information retrieval system as recited in claim 23, wherein the pairs of index key values form an ordered sequence of such pairs, each pair of index key values comprising a lower key value and an upper key value, wherein the lower key value of a particular pair is the same as the upper key value of the subsequent pair.
25. An information retrieval system as recited in claim 23, wherein an individual entry of the page definition table indicates bounding key values that are identical to the index key values indicated by said individual entry.
26. An information retrieval system as recited in claim 23, wherein an individual entry of the page definition table indicates bounding key values that span the index key values indicated by said individual entry.
27. An information retrieval system comprising: a server having access to one or more data storage media; a database maintained on said one or more data storage media, the database having a plurality of records that can be searched based on key values; a page definition table maintained on said one or more data storage media, the page definition table having a plurality of entries indicating pairs of index key values and pairs of bounding key values; a client that submits a search value to the server; the server being responsive to submission of a search value to identify an entry from the page definition table whose index values surround the supplied search value; to identify a set of database records whose key values are bounded by the pair of bounding key values of the identified entry; and to return data from the identified set of database records to the client.
28. An information retrieval system as recited in claim 23, wherein the server is further responsive to submission of the search value to identify additional neighboring sets of database records and to return data from such additional sets of database records to the client.
29. An information retrieval system as recited in claim 27, wherein the pairs of index key values form an ordered sequence of such pairs, each pair of index key values comprising a lower key value and an upper key value, wherein the lower key value of a particular pair is the same as the upper key value of the subsequent pair.
30. An information retrieval system as recited in claim 27, wherein an individual entry of the page definition table indicates bounding key values that are identical to the index key values indicated by said individual entry.
31. An information retrieval system as recited in claim 27, wherein an individual entry of the page definition table indicates bounding key values that span the index key values indicated by said individual entry.
32. An information retrieval system as recited in claim 27, wherein the server returns the index key values to the client along with the data from the identified set of database records.
33. An information retrieval system as recited in claim 27, wherein the server returns the bounding key values to the client along with the data from the defined set of database records.
34. An information retrieval system as recited in claim 27, wherein: the server returns the index key values to the client along with the data from the identified set of database records; the client submits a new search criteria to the server to browse to a different set of database records, wherein the new search criteria is derived from the returned index values.
35. An information retrieval system as recited in claim 27, wherein: the server returns the index key values to the client along with the data from the identified set of database records; the client derives a new search value based on the returned index key values; the client submits the new search value to the server to browse to a different set of database records.
PCT/US1998/011199 1997-06-09 1998-06-01 Database query system and method WO1998057278A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
AU78065/98A AU7806598A (en) 1997-06-09 1998-06-01 Database query system and method
JP50271199A JP4425355B2 (en) 1997-06-09 1998-06-01 Database query system and method
EP98926166A EP1010101A1 (en) 1997-06-09 1998-06-01 Database query system and method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/871,079 US6061677A (en) 1997-06-09 1997-06-09 Database query system and method
US08/871,079 1997-06-09

Publications (1)

Publication Number Publication Date
WO1998057278A1 true WO1998057278A1 (en) 1998-12-17

Family

ID=25356680

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/011199 WO1998057278A1 (en) 1997-06-09 1998-06-01 Database query system and method

Country Status (5)

Country Link
US (2) US6061677A (en)
EP (1) EP1010101A1 (en)
JP (2) JP4425355B2 (en)
AU (1) AU7806598A (en)
WO (1) WO1998057278A1 (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6513028B1 (en) * 1999-06-25 2003-01-28 International Business Machines Corporation Method, system, and program for searching a list of entries when search criteria is provided for less than all of the fields in an entry
JP4483034B2 (en) * 2000-06-06 2010-06-16 株式会社日立製作所 Heterogeneous data source integrated access method
CA2342578A1 (en) * 2001-03-29 2002-09-29 Ibm Canada Limited-Ibm Canada Limitee Method and apparatus for security of a network server
FI20010724A (en) * 2001-04-06 2002-10-07 Ajatuspaja Oy A method and hardware for accessing a distributed database and an application utilizing it in a terminal-independent manner
US7672945B1 (en) 2002-04-08 2010-03-02 Oracle International Corporation Mechanism for creating member private data in a global namespace
US7136867B1 (en) * 2002-04-08 2006-11-14 Oracle International Corporation Metadata format for hierarchical data storage on a raw storage device
US20040193656A1 (en) * 2003-03-28 2004-09-30 Pizzo Michael J. Systems and methods for caching and invalidating database results and derived objects
US7835953B2 (en) * 2003-09-29 2010-11-16 International Business Machines Corporation Method and structure for monitoring moving objects
US8972380B2 (en) * 2003-09-29 2015-03-03 International Business Machines Corporaton System and method for monitoring events against continual range queries
CA2465558A1 (en) * 2004-04-22 2005-10-22 Ibm Canada Limited - Ibm Canada Limitee Framework for retrieval and display of large result sets
US20070005490A1 (en) * 2004-08-31 2007-01-04 Gopalakrishnan Kumar C Methods and System for Distributed E-commerce
US8370323B2 (en) 2004-08-31 2013-02-05 Intel Corporation Providing information services related to multimodal inputs
US7873911B2 (en) * 2004-08-31 2011-01-18 Gopalakrishnan Kumar C Methods for providing information services related to visual imagery
WO2006036442A2 (en) * 2004-08-31 2006-04-06 Gopalakrishnan Kumar Method and system for providing information services relevant to visual imagery
US20060116983A1 (en) * 2004-11-30 2006-06-01 International Business Machines Corporation System and method for ordering query results
US20070192215A1 (en) * 2006-02-10 2007-08-16 Taylor Thomas B Computer-implemented registration for providing inventory fulfillment services to merchants
US7895175B2 (en) * 2006-11-15 2011-02-22 Yahoo! Inc. Client-side federated search
US7853480B2 (en) * 2007-05-21 2010-12-14 Amazon Technologies, Inc. System and method for providing export services to merchants
US9002886B2 (en) 2013-03-14 2015-04-07 The Neilsen Company (US), LLC Methods and apparatus to search datasets
US9454557B2 (en) * 2013-08-29 2016-09-27 Oracle International Corporation Unit of work based incremental data processing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0294583A2 (en) * 1987-05-08 1988-12-14 William L. Kuechler System and method for the generalized topological mapping of an information base
US5515531A (en) * 1992-05-29 1996-05-07 Hitachi, Ltd. Parallel database processing system and retrieval method using secondary key

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4956774A (en) * 1988-09-02 1990-09-11 International Business Machines Corporation Data base optimizer using most frequency values statistics
JPH034366A (en) * 1989-05-31 1991-01-10 Toshiba Corp Data base inquiry processing system
JP2969153B2 (en) * 1990-06-29 1999-11-02 カシオ計算機株式会社 Record search method
JPH05314188A (en) * 1992-05-11 1993-11-26 Nec Software Ltd Retrieval processing system for data having range
US5440732A (en) * 1993-02-05 1995-08-08 Digital Equipment Corp., Pat. Law Gr. Key-range locking with index trees
US5485607A (en) * 1993-02-05 1996-01-16 Digital Equipment Corporation Concurrency-control method and apparatus in a database management system utilizing key-valued locking
JP3433803B2 (en) * 1993-07-07 2003-08-04 ヨーロピアン コンピューター − インダストリー リサーチ センター ゲーエムベーハー Database structure
US5542087A (en) * 1993-10-15 1996-07-30 Hewlett-Packard Company Linear hashing for distributed records
US5664172A (en) * 1994-07-19 1997-09-02 Oracle Corporation Range-based query optimizer
US5758353A (en) * 1995-12-01 1998-05-26 Sand Technology Systems International, Inc. Storage and retrieval of ordered sets of keys in a compact 0-complete tree

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0294583A2 (en) * 1987-05-08 1988-12-14 William L. Kuechler System and method for the generalized topological mapping of an information base
US5515531A (en) * 1992-05-29 1996-05-07 Hitachi, Ltd. Parallel database processing system and retrieval method using secondary key

Also Published As

Publication number Publication date
EP1010101A1 (en) 2000-06-21
AU7806598A (en) 1998-12-30
US6061677A (en) 2000-05-09
JP2002511169A (en) 2002-04-09
JP2010044777A (en) 2010-02-25
JP4425355B2 (en) 2010-03-03
US6519587B1 (en) 2003-02-11
JP4542195B2 (en) 2010-09-08

Similar Documents

Publication Publication Date Title
US6061677A (en) Database query system and method
US8224870B1 (en) Data structure for incremental search
US7529739B2 (en) Temporal ranking scheme for desktop searching
US7617197B2 (en) Combined title prefix and full-word content searching
JP4249726B2 (en) Method and system for indexing and searching database groups
CA2511098C (en) Dispersing search engine results by using page category information
US6792414B2 (en) Generalized keyword matching for keyword based searching over relational databases
US6363377B1 (en) Search data processor
US7519595B2 (en) Method and system for adaptive categorial presentation of search results
US9165033B1 (en) Efficient query rewriting
US6801904B2 (en) System for keyword based searching over relational databases
US7152064B2 (en) Searching tool and process for unified search using categories and keywords
US7899844B2 (en) Method and system for access and display of data from large data sets
US6266660B1 (en) Secondary index search
KR20060021858A (en) Heterogeneous indexing for annotation systems
US6513028B1 (en) Method, system, and program for searching a list of entries when search criteria is provided for less than all of the fields in an entry
Wang et al. Concept hierarchy based text database categorization in a metasearch engine environment
Bhattacharjee et al. Efficient query processing for multi-dimensionally clustered tables in DB2
US20070005631A1 (en) Apparatus and method for dynamically determining index split options from monitored database activity
US20080162427A1 (en) Hybrid Relational, Directory, and Content Query Facility
Gupta et al. Comparative Study of Indexing Techniques In DBMS
Borse et al. Inverted Index for Fast Nearest Neighbour
Heidenreich Know your search engine: a comparison of six popular Web search engines
DRTC Organizing the Web: A Faceted Approach
JPS62288934A (en) Massive information retrieving system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 1998926166

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP Wipo information: published in national office

Ref document number: 1998926166

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: CA

WWW Wipo information: withdrawn in national office

Ref document number: 1998926166

Country of ref document: EP