WO2000010106A1 - Mapping information sources - Google Patents

Publication number
WO2000010106A1
Authority
WO
WIPO (PCT)
Prior art keywords
entity
information
address
world
directory
Prior art date
Application number
PCT/US1999/018644
Other languages
French (fr)
Inventor
Jeffrey Dean Black
Jason Harvey Titus
Ira Joseph Woodhead
Original Assignee
Atlas Corporation
Priority date
Filing date
Publication date
Application filed by Atlas Corporation
Priority to AU55659/99A (AU5565999A)
Publication of WO2000010106A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/954 Navigation, e.g. using categorised browsing
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99934 Query formulation, input preparation, or translation
    • Y10S707/99935 Query augmenting and refining, e.g. inexact access

Definitions

  • mapping information sources Computer-stored information about an entity such as a business may be available from different computerized sources, including a World-Wide Web ("Web") accessible source such as a Web site that is under the control of the entity, and a third-party information source that may store public information about the entity.
  • public financial information about a company may be stored in a database that is not linked to the company's Web site or is not directly accessible by Web browser software, such as a database under the control of a financial services firm.
  • the identity of the entity controlling a Web site is not clear or is unknown, particularly to a user of the Web site, and therefore retrieving third-party information about the entity can be difficult or even impossible for the user, which can cause problems for the user, the entity, or others.
  • Much of the information available on Web sites is organized into Web pages that can be retrieved and displayed by Web browser software under the direction of a user.
  • Each of the Web pages is identifiable by a respective Uniform Resource Locator text string ("URL"), such as "http://www.isp321.com/frontpage.html", that the Web browser software can use to select the page.
  • Each URL includes a domain name, such as "isp321.com", that identifies the Web site where the corresponding Web page is stored for retrieval by Web browser software.
  • Each domain name is registered by an entity that controls the corresponding Web site and Web pages.
  • a domain name registry organization maintains the domain name registration information, which may include name, address, and other information that allows the organization to bill the entity for payment for the maintenance.
  • An Internet service provider is an example of an entity that may have a registered domain name for a Web site.
  • an ISP has customers such as individuals or businesses for whom the ISP stores Web pages on the Web site for retrieval by Web browser software.
  • the ISP may have a customer Maple Street Plumbing for which the ISP stores a home Web page having a URL that includes a prefix "http://www.isp321.com/~maplesrplumb".
  • a home Web page is typically the only or the primary entry point into a Web site or a set of Web pages that are under the control of an entity.
  • Another example of an entity that may have a registered domain name is a Web portal site that maintains, in pages organized by categories, links to Web sites and home pages that are under the control of other entities.
  • a Web portal site allows another entity to create a link from the Web portal site to the other entity's Web site or home page by submitting information to the Web portal site.
  • a method and a system are provided that allow different sets of information from different computerized sources to be mapped to each other to indicate that the different sets pertain to the same entity such as a business.
  • a first set of information is acquired that identifies an entity that is indicated as having control over the use of at least a portion of a World-Wide Web address. It is determined from the first set of information that a second set of information that identifies the entity is included in an entity directory. It is recorded that the use of the at least a portion of the World-Wide Web address is under the control of the entity as identified in the entity directory.
  • a mapping database can be provided that effectively groups together, by entity, information available from Web sites and information available from other sources. Third-party confirmation can be provided regarding the identity of an entity that is indicated as having control over a Web site.
  • Figs. 1-2, 5, 7, and 9 are block diagrams of computer-based systems.
  • Figs. 10-11 are illustrations of output produced by software.
  • Figs. 3-4, 6 and 8 are flow diagrams of computer-based procedures.
  • Fig. 2 illustrates a mapping system 200, described below, in which domain names and other top-level URLs are mapped to entities such as businesses, in accordance with a procedure 1000, also described below (Fig. 3).
  • a mapping system may be used to produce a computer system 10 (Fig. 1) in which a mapping database 12 maps URLs or domain names 14 to entities 16 such as people, businesses, or government agencies, as described in more detail below.
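  • the mapping database may indicate that any URL that begins with "http://www.uspto.gov" is for a Web page controlled by the U.S. Patent and Trademark Office, or that domain names "elmstdogs.com" and "elmstcats.com" are under the control of a company named Elm Street Pets, Inc.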
  • mapping database may use a unique identification number ("unique ID"), such as a 9-digit American Business Information (“ABI”) number, to identify an entity so that other information about the entity can be retrieved from sources outside the mapping database by searching under the unique ID. (ABI numbers are sponsored by infoUSA.)
  • a top-level URL 202 is acquired from a URL source 204 such as a domain name registry database (step 1010).
  • Raw entity information 205 is acquired from a URL guide 206 (which may be the same domain name registry database) identifying an entity that is indicated as having control over the top-level URL (step 1020).
  • the raw entity information is refined to produce refined entity information 208 that better conforms to a set of format standards such as standards described below (step 1030).
  • the refined entity information is submitted to a matching application 210 that attempts to identify, in an entity directory 212, an entity that matches the refined entity information ("matched entity”) (step 1040). If a matched entity is found, an entry is added to a mapping database 214 to associate the top-level URL with the matched entity (step 1050).
  • procedure 1000 may vary, depending on the natures of the top-level URL and the URL source involved.
  • One implementation takes the form of procedure 2000 (Fig. 4) that is for cases in which a domain name 300 (Fig. 5) serves as the top-level URL and a domain name registry 302 serves as the URL source.
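  • the domain name, such as "elmstpets.com", is acquired from the domain name registry by submitting a query to the domain name registry (step 2010).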
  • the query may take the form of a request for database records that match a search such as a search for all ".com" domain names.
  • Raw entity information 304 such as "Elm Street Pets, Inc., 10 Elm St, Elmtown, Elmstate 00000" is acquired from the domain name registry by submitting a "whois" request to the domain name registry (step 2020).
  • the "whois" request directs the domain name registry to disclose the billing name and address information that the domain name registry received at the time the domain name was initially registered with the domain name registry.
  • the domain name registry returns the contents of a record with one name field and one address field (usually including new-line characters) among other information. The contents of the name and address fields are taken.
  • the raw entity information is refined to produce refined entity information 306 that conforms to a set of format standards (step 2030).
  • the set of format standards may include a postal standard for addresses.
  • the name of the state in the raw entity information may be changed to a standard two-letter postal abbreviation (e.g., "Elmstate" to "ES").
  • the address field is parsed to retrieve street, city, state, and zip code information. If necessary, the name and street information is truncated to thirty characters. Further changes made for conformity with naming conventions may include, for example, changes from "INCORPORATED" to "INC", from "COMMUNICATION" to "COMM", and from "McDonald" to "MC DONALD".
  • DSF Delivery Sequence File
  • NCOA National Change of Address
  • the refined entity information is produced to increase the chance that a matching application as described below will produce a useful match.
  • the refined entity information is submitted to a matching application 308 such as an infoUSA or Group1 matching engine (available from Group1 Software, 4200 Parliament Place, Suite 600, Lanham MD 20706-1844) that attempts to identify, in an entity directory 310 such as an ABI database, an entity that matches the refined entity information ("matched entity") (step 2040).
  • the matching engine cleans and standardizes address information (including by adding missing address information, standardizing city names and two-character state abbreviations, and correcting misspelled address elements) and then compares the address information to determine whether a match can be found.
  • the ABI database includes name and address information for entities such as businesses.
  • an entry is added to a mapping database 312 to associate the domain name with the matched entity (step 2050).
  • the association in the entry may be accomplished by retrieving an ABI number for the matched entity from the ABI database and including the ABI number together with the domain name in the entry.
  • a procedure 3000 (Fig. 6) may be used where a hosting entity 400 (Fig. 7) hosts the Web site on behalf of the entity that has control over a top-level URL 402.
  • the hosting entity may be a pet services clearinghouse that has control over a domain name "elmstpetservices.com"
  • the top-level URL may be "http://www.elmstpetservices.com/~petdoctor" and may be under the control of a veterinary service that is a service subscriber, i.e., that rents Web site space from the pet services clearinghouse.
  • the top-level URL is acquired from the hosting entity (step 3010).
  • the top-level URL may be acquired by requesting a report of hosted Web sites from a directory service provided by the hosting entity.
  • Raw entity information 404 for the top-level URL is acquired from the hosting entity (step 3020).
  • the raw entity information may include subscriber information such as name and address information that allows the hosting entity to bill the entity.
  • the raw entity information is refined to produce refined entity information 406 that better conforms to a set of format standards (step 3030).
  • As stated above, postal standards may be imposed.
  • the refined entity information is submitted to a matching application 408 that attempts to identify, in an entity directory 410, an entity that matches the refined entity information ("matched entity") (step 3040). If a matched entity is found, an entry is added to a mapping database 412 to associate the top-level URL with the matched entity (step 3050). An ABI number may be retrieved and used for such association as described above.
  • a search-oriented implementation, procedure 4000, may be used where billing information about the entity is not available.
  • a top-level URL 500 (Fig. 9) is acquired from a URL source 502 (step 4010).
  • Web pages at the Web site 503 identified by the top-level URL are searched for raw entity information 504 about the entity that has control over the top-level URL (step 4020).
  • the raw entity information may be found on one or more of the Web pages in the form of contact information. For example, if the top-level URL is "http://www.isp456.com/~elmstcats" and one of the Web pages at the Web site includes the text "For more information, please contact Elm St. Pets" followed by an address and telephone number, the words "Elm St. Pets" and the address and telephone number may be interpreted as raw entity information.
  • an address parser searches the top-level page at the Web site for contact information. The address parser may search first for numbers arranged in a zip code or telephone number format, and if a zip code or telephone number is found, may search the immediately surrounding text for name and address information.
  • the raw entity information is refined to produce refined entity information 506 that better conforms to a set of format standards, which may include postal standards as described above (step 4030).
  • the refined entity information is submitted to a matching application 508 that attempts to identify, in an entity directory 510, an entity that matches the refined entity information ("matched entity") (step 4040). If a matched entity is found, an entry is added to a mapping database 512 to associate the top-level URL with the matched entity (step 4050).
  • a filtering process may be executed before the entry is added to the mapping database. In the filtering process, a check for actual name and address similarity is performed.
  • the filtering process checks to determine whether the names are very dissimilar, because in some cases a match may be indicated by the matching application even if such name dissimilarity exists. For example, the matching application may not always correctly identify an entity that has occupied part of a building that is listed as being fully occupied by another entity such as the building's owner.
  • the filtering process may also include checking whether, in the results produced by the matching application, the zip code matches the county or metropolitan area indicated elsewhere in the results.
  • Procedure 3000 may also include a scoring process before the entry is added to the mapping database.
  • in the scoring process, the contents of each field in the name and address in the matching entity information returned by the matching application are given a similarity score indicating the similarity of the contents to the corresponding contents of the refined entity information.
  • An overall score such as an average is calculated based on the similarity scores for the fields, and the overall score is tested against a threshold to determine whether the matching entity information returned by the matching application is acceptable.
  • a highest score is assigned to a perfectly matching telephone number, and lower scores are assigned for other types of matches such as zip code and address matches.
  • an overall score of at least 90 may indicate a reliably correct match
  • an overall score that is less than 90 but is at least 80 may indicate that manual spot checking is necessary
  • an overall score that is less than 80 but is at least 70 may indicate that full manual checking is necessary
  • an overall score that is less than 70 may indicate that the purported match is not worth considering.
  • the scoring process may be used to compare the raw, refined, or matching entity information to existing entries in the mapping database, to determine whether an entry already exists for the entity, perhaps in connection with a different top-level URL. For example, the matching entity information returned by the matching application may be compared, by the scoring process, to entries in the mapping database, and the entry having the highest overall score that is above the threshold may be deemed an already-existing entry for the entity.
  • the scoring process may include additional refining or reformatting of the information involved, including the removal of spaces and punctuation.
  • Five examples of the results of executions of procedure 2000 are shown in
  • the unique IDs provided in the mapping database may be used to search an information source outside the entity information database to produce a subset of the mapping database that has records only for entities having a particular characteristic, such as a particular geographic location or between 1000 and 5000 employees.
  • each of the entities may be assigned different unique IDs, and the different unique IDs may be linked in the mapping database to note the relationship among the entities. For example, a company that has offices in different locations may be assigned a unique ID for the company itself and a respective different unique ID for each location. In another example, when two previously unrelated companies merge or one is acquired by the other, each may retain its unique ID and a new, different unique ID may be assigned to the combination of the two companies, or both companies may be assigned the same unique ID.
  • the raw entity information may be acquired in other ways such as from the entity itself.
  • the entity may submit raw entity information in an on-line questionnaire.
  • mapping database and applications based on the mapping database may take advantage of a hierarchical organization of Web pages, by treating similarly a top-level URL page and all pages below the top-level URL page, such as pages sharing a particular prefix with the top-level URL page. For example, all pages sharing the prefix "http://www.isp321.com" may be treated as being under the control of an ISP named Global ISP Co.
  • the mapping database may map an entity to Web pages maintained at different Web sites. In such a case, the entry in the mapping database associates the entity with all the sites.
  • One or more of the databases and directories referenced above may be or include a relational database and may have records to which fields may be added readily.
  • any of many different types of computer equipment may be used.
  • one or more Intel-based personal computers may be used that run an SQL database on Linux and one or more programs written in Perl or the C programming language with interfaces to the SQL database.
  • the technique may be implemented in hardware or software, or a combination of both.
  • the technique is implemented in computer programs executing on one or more programmable computers, such as a personal computer running or able to run an operating system such as Unix, Linux, Microsoft Windows 95, 98, or NT, or Macintosh OS, that each include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device such as a keyboard, and at least one output device.
  • Program code is applied to data entered using the input device to perform the technique described above and to generate output information.
  • the output information is applied to one or more output devices such as a display screen of the computer.
  • each program is implemented in a high level procedural or object-oriented programming language such as Perl, C, C++, or Java to communicate with a computer system.
  • the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language.
  • each such computer program is stored on a storage medium or device, such as ROM or optical or magnetic disc, that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described in this document.
  • the system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner.
  • the user may be a human being or a non-human entity such as a computer program or an automated device that may interact with one or more of the databases or one or more of the applications via an application programming interface ("API") or a network message.
  • An on-line information store or multiple databases may serve as the entity directory, which may take the form of any mechanism that provides automated access to information, such as a spreadsheet file or a store of email messages.
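
The scoring process and its thresholds (90, 80, 70) listed above can be illustrated with a minimal Python sketch. The document does not say how field similarity is computed, so the use of `difflib.SequenceMatcher`, the plain average, and the names `field_similarity`, `overall_score`, and `classify` are assumptions made for illustration only. The sketch also does not weight a telephone match more heavily than other fields, which the text suggests may be done.

```python
from difflib import SequenceMatcher


def field_similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score for one field.

    Spaces and punctuation are removed first, as the text says the
    scoring process may do. SequenceMatcher is an assumed stand-in
    for whatever similarity measure is actually used.
    """
    def norm(s: str) -> str:
        return "".join(ch for ch in s.upper() if ch.isalnum())

    return 100.0 * SequenceMatcher(None, norm(a), norm(b)).ratio()


def overall_score(refined: dict, candidate: dict) -> float:
    """Average the per-field scores over the fields both records share."""
    fields = [f for f in refined if f in candidate]
    if not fields:
        return 0.0
    total = sum(field_similarity(refined[f], candidate[f]) for f in fields)
    return total / len(fields)


def classify(score: float) -> str:
    """Map an overall score to the handling tiers described in the text."""
    if score >= 90:
        return "reliable"      # reliably correct match
    if score >= 80:
        return "spot_check"    # manual spot checking
    if score >= 70:
        return "full_check"    # full manual checking
    return "reject"            # not worth considering
```

The same `overall_score` function could be reused to compare a returned match against entries already in the mapping database, keeping the highest-scoring entry above the threshold.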

Abstract

A first set of information is acquired that identifies an entity that is indicated as having control over the use of at least a portion of a World-Wide Web address. It is determined from the first set of information that a second set of information that identifies the entity is included in an entity directory. It is recorded that the use of the at least a portion of the World-Wide Web address is under the control of the entity as identified in the entity directory.

Description

MAPPING INFORMATION SOURCES
Cross-Reference to Related Applications
This application claims the benefit of United States Provisional Application Serial No. 60/097029 entitled "Collecting, Combining, Analyzing, and Using Internet and Business Information" filed on August 17, 1998, which is incorporated herein.
Background of the Invention
This application relates to mapping information sources. Computer-stored information about an entity such as a business may be available from different computerized sources, including a World-Wide Web ("Web") accessible source such as a Web site that is under the control of the entity, and a third-party information source that may store public information about the entity. For example, public financial information about a company may be stored in a database that is not linked to the company's Web site or is not directly accessible by Web browser software, such as a database under the control of a financial services firm. In many instances, the identity of the entity controlling a Web site is not clear or is unknown, particularly to a user of the Web site, and therefore retrieving third-party information about the entity can be difficult or even impossible for the user, which can cause problems for the user, the entity, or others.
Much of the information available on Web sites is organized into Web pages that can be retrieved and displayed by Web browser software under the direction of a user. Each of the Web pages is identifiable by a respective Uniform Resource Locator text string ("URL"), such as "http://www.isp321.com/frontpage.html", that the Web browser software can use to select the page. Each URL includes a domain name, such as "isp321.com", that identifies the Web site where the corresponding Web page is stored for retrieval by Web browser software. Each domain name is registered by an entity that controls the corresponding Web site and Web pages. A domain name registry organization maintains the domain name registration information, which may include name, address, and other information that allows the organization to bill the entity for payment for the maintenance. (It is to be understood that the term "registry", as used herein, also refers to a domain name registrar or any other entity that may provide assistance in registering a domain name.) An Internet service provider ("ISP") is an example of an entity that may have a registered domain name for a Web site.
Typically, an ISP has customers such as individuals or businesses for whom the ISP stores Web pages on the Web site for retrieval by Web browser software. For example, the ISP may have a customer Maple Street Plumbing for which the ISP stores a home Web page having a URL that includes a prefix "http://www.isp321.com/~maplesrplumb". A home Web page is typically the only or the primary entry point into a Web site or a set of Web pages that are under the control of an entity.
Another example of an entity that may have a registered domain name is a Web portal site that maintains, in pages organized by categories, links to Web sites and home pages that are under the control of other entities. Typically, a Web portal site allows another entity to create a link from the Web portal site to the other entity's Web site or home page by submitting information to the Web portal site.
Summary of the Invention
A method and a system are provided that allow different sets of information from different computerized sources to be mapped to each other to indicate that the different sets pertain to the same entity such as a business. A first set of information is acquired that identifies an entity that is indicated as having control over the use of at least a portion of a World-Wide Web address. It is determined from the first set of information that a second set of information that identifies the entity is included in an entity directory. It is recorded that the use of the at least a portion of the World-Wide Web address is under the control of the entity as identified in the entity directory. Different aspects of the invention allow one or more of the following. A mapping database can be provided that effectively groups together, by entity, information available from Web sites and information available from other sources. Third-party confirmation can be provided regarding the identity of an entity that is indicated as having control over a Web site.
Other features and advantages will become apparent from the following description, including the drawings, and from the claims.
Brief Description of the Drawings
Figs. 1-2, 5, 7, and 9 are block diagrams of computer-based systems.
Figs. 10-11 are illustrations of output produced by software.
Figs. 3-4, 6 and 8 are flow diagrams of computer-based procedures.
Detailed Description
Fig. 2 illustrates a mapping system 200, described below, in which domain names and other top-level URLs are mapped to entities such as businesses, in accordance with a procedure 1000, also described below (Fig. 3). Such a mapping system may be used to produce a computer system 10 (Fig. 1) in which a mapping database 12 maps URLs or domain names 14 to entities 16 such as people, businesses, or government agencies, as described in more detail below. For example, the mapping database may indicate that any URL that begins with "http://www.uspto.gov" is for a Web page controlled by the U.S. Patent and Trademark Office, or that domain names "elmstdogs.com" and "elmstcats.com" are under the control of a company named Elm Street Pets, Inc. The mapping database may use a unique identification number ("unique ID"), such as a 9-digit American Business Information ("ABI") number, to identify an entity so that other information about the entity can be retrieved from sources outside the mapping database by searching under the unique ID. (ABI numbers are sponsored by infoUSA.)
With reference to Figs. 2-3, procedure 1000 is now described. A top-level URL 202 is acquired from a URL source 204 such as a domain name registry database (step 1010). Raw entity information 205 is acquired from a URL guide 206 (which may be the same domain name registry database) identifying an entity that is indicated as having control over the top-level URL (step 1020). The raw entity information is refined to produce refined entity information 208 that better conforms to a set of format standards such as standards described below (step 1030). The refined entity information is submitted to a matching application 210 that attempts to identify, in an entity directory 212, an entity that matches the refined entity information ("matched entity") (step 1040). If a matched entity is found, an entry is added to a mapping database 214 to associate the top-level URL with the matched entity (step 1050). If the entity directory includes a unique identification ("ID") number for the matched entity, the association in the entry may be accomplished by retrieving the unique ID for the matched entity and including the unique ID together with the top-level URL in the entry.
Specific implementations of procedure 1000 may vary, depending on the natures of the top-level URL and the URL source involved. One implementation takes the form of procedure 2000 (Fig. 4) that is for cases in which a domain name 300 (Fig. 5) serves as the top-level URL and a domain name registry 302 serves as the URL source. The domain name, such as "elmstpets.com", is acquired from the domain name registry by submitting a query to the domain name registry (step 2010). The query may take the form of a request for database records that match a search such as a search for all ".com" domain names.
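
The flow of procedure 1000 (steps 1020 through 1050, applied to a top-level URL already acquired in step 1010) can be sketched in Python as a small pipeline. This is only an illustration under assumptions: the function name `map_top_level_url`, the callable parameters, and the use of a plain dictionary as the mapping database are hypothetical and are not taken from the description.

```python
from typing import Callable, Dict, Optional


def map_top_level_url(
    url: str,
    get_raw: Callable[[str], str],
    refine: Callable[[str], str],
    match: Callable[[str], Optional[str]],
    mapping_db: Dict[str, str],
) -> bool:
    """Run one top-level URL through the steps of procedure 1000.

    get_raw    -- stands in for the URL guide (step 1020)
    refine     -- stands in for format standardization (step 1030)
    match      -- stands in for the matching application; returns the
                  matched entity's unique ID, or None (step 1040)
    mapping_db -- hypothetical mapping database: URL -> unique ID
    """
    raw = get_raw(url)            # step 1020: acquire raw entity information
    refined = refine(raw)         # step 1030: refine to format standards
    unique_id = match(refined)    # step 1040: look up the entity directory
    if unique_id is None:
        return False              # no matched entity, so no entry is added
    mapping_db[url] = unique_id   # step 1050: record the association
    return True
```

Passing the URL guide, refiner, and matcher as callables keeps the sketch neutral about whether the source is a domain name registry, a hosting entity, or the Web pages themselves.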
Raw entity information 304, such as "Elm Street Pets, Inc., 10 Elm St, Elmtown, Elmstate 00000", is acquired from the domain name registry by submitting a "whois" request to the domain name registry (step 2020). The "whois" request directs the domain name registry to disclose the billing name and address information that the domain name registry received at the time the domain name was initially registered with the domain name registry. The domain name registry returns the contents of a record with one name field and one address field (usually including new-line characters) among other information. The contents of the name and address fields are taken.
The raw entity information is refined to produce refined entity information 306 that conforms to a set of format standards (step 2030). The set of format standards may include a postal standard for addresses. For example, the name of the state in the raw entity information may be changed to a standard two-letter postal abbreviation (e.g., "Elmstate" to "ES"). In particular, the address field is parsed to retrieve street, city, state, and zip code information. If necessary, the name and street information are truncated to thirty characters. Further changes made for conformity with naming conventions may include, for example, changes from "INCORPORATED" to "INC", from "COMMUNICATION" to "COMM", and from "McDonald" to "MC DONALD". The results are submitted to a postal-oriented application known as Delivery Sequence File ("DSF"), which may result in further name and address standardization. A National Change of Address ("NCOA") process is then applied so that an address change due to a move is taken into account in the refined entity information. DSF and NCOA are U.S. Postal Service applications. DSF performs address processing services such as correcting zip codes and expanding zip codes from five-digit format to nine-digit format. NCOA matches names and addresses to changes of address filed by relocating postal customers; when a match is found for a name and address that has been submitted, a new address is returned.
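The refinement rules in step 2030 (state abbreviation, thirty-character truncation, and naming conventions) can be sketched as below. This is an illustrative sketch only: the lookup tables are tiny hypothetical stand-ins for full postal tables, the "street, city, state zip" layout is an assumed input format, and the DSF and NCOA processing is not modeled.

```python
import re

# Hypothetical lookup tables; a real system would use complete postal tables.
STATE_ABBREVIATIONS = {"ELMSTATE": "ES"}
NAME_CONVENTIONS = {"INCORPORATED": "INC", "COMMUNICATION": "COMM"}
MAX_FIELD_LENGTH = 30  # name and street are truncated to thirty characters


def refine_name(name: str) -> str:
    """Upper-case, drop commas and periods, apply naming conventions, truncate."""
    words = re.sub(r"[.,]", "", name.upper()).split()
    words = [NAME_CONVENTIONS.get(word, word) for word in words]
    return " ".join(words)[:MAX_FIELD_LENGTH]


def parse_address(address: str) -> dict:
    """Split an address assumed to look like 'street, city, state zip'."""
    street, city, state_zip = [part.strip() for part in address.split(",")]
    state, zip_code = state_zip.rsplit(" ", 1)
    state = STATE_ABBREVIATIONS.get(state.upper(), state.upper())
    return {
        "street": street.upper()[:MAX_FIELD_LENGTH],
        "city": city.upper(),
        "state": state,
        "zip": zip_code,
    }


refined_name = refine_name("Elm Street Pets, Incorporated")
refined_address = parse_address("10 Elm St, Elmtown, Elmstate 00000")
```

Addresses that do not follow the assumed three-part layout would raise an error here; a production parser would need to handle many more layouts.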
The refined entity information is produced to increase the chance that a matching application as described below will produce a useful match. The refined entity information is submitted to a matching application 308 such as an infoUSA or Group1 matching engine (available from Group1 Software, 4200 Parliament Place, Suite 600, Lanham MD 20706-1844) that attempts to identify, in an entity directory 310 such as an ABI database, an entity that matches the refined entity information ("matched entity") (step 2040). The matching engine cleans and standardizes address information (including by adding missing address information, standardizing city names and two-character state abbreviations, and correcting misspelled address elements) and then compares the address information to determine whether a match can be found. The ABI database includes name and address information for entities such as businesses. If a matched entity is found, an entry is added to a mapping database 312 to associate the domain name with the matched entity (step 2050). The association in the entry may be accomplished by retrieving an ABI number for the matched entity from the ABI database and including the ABI number together with the domain name in the entry.
A procedure 3000 (Fig. 6) may be used where a hosting entity 400 (Fig. 7) hosts the Web site on behalf of the entity that has control over a top-level URL 402. For example, the hosting entity may be a pet services clearinghouse that has control over a domain name "elmstpetservices.com", and the top-level URL may be "http://www.elmstpetservices.com/~petdoctor" and may be under the control of a veterinary service that is a service subscriber, i.e., that rents Web site space from the pet services clearinghouse. The top-level URL is acquired from the hosting entity (step 3010). The top-level URL may be acquired by requesting a report of hosted Web sites from a directory service provided by the hosting entity.
Raw entity information 404 for the top-level URL is acquired from the hosting entity (step 3020). The raw entity information may include subscriber information such as name and address information that allows the hosting entity to bill the entity.
The raw entity information is refined to produce refined entity information 406 that better conforms to a set of format standards (step 3030). As stated above, postal standards may be imposed.
The refined entity information is submitted to a matching application 408 that attempts to identify, in an entity directory 410, an entity that matches the refined entity information ("matched entity") (step 3040). If a matched entity is found, an entry is added to a mapping database 412 to associate the top-level URL with the matched entity (step 3050). An ABI number may be retrieved and used for such association as described above.
A search-oriented implementation, procedure 4000 (Fig. 8), may be used where billing information about the entity is not available. A top-level URL 500 (Fig. 9) is acquired from a URL source 502 (step 4010).
Web pages at the Web site 503 identified by the top-level URL are searched for raw entity information 504 about the entity that has control over the top-level URL (step 4020). The raw entity information may be found on one or more of the Web pages in the form of contact information. For example, if the top-level URL is "http://www.isp456.com/~elmstcats" and one of the Web pages at the Web site includes the text "For more information, please contact Elm St. Pets" followed by an address and telephone number, the words "Elm St. Pets" and the address and telephone number may be interpreted as raw entity information. In a specific implementation, an address parser searches the top-level page at the Web site for contact information. The address parser may search first for numbers arranged in a zip code or telephone number format, and if a zip code or telephone number is found, may search the immediately surrounding text for name and address information.
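The address parser's two-stage search (first a zip code or telephone number pattern, then the surrounding text) might look like the sketch below. The regular expressions, the 80-character window, and the function name are assumptions chosen for illustration; they cover only simple U.S.-style formats.

```python
import re

# Assumed formats: 5-digit or ZIP+4 codes, and 3-3-4 telephone numbers.
ZIP_RE = re.compile(r"\b\d{5}(?:-\d{4})?\b")
PHONE_RE = re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b")


def find_contact_text(page_text: str, window: int = 80):
    """Return the text surrounding the first zip code or phone number, or None."""
    match = ZIP_RE.search(page_text) or PHONE_RE.search(page_text)
    if match is None:
        return None
    # Keep a fixed-size window on each side as candidate name and address text.
    start = max(0, match.start() - window)
    end = min(len(page_text), match.end() + window)
    return page_text[start:end].strip()


page = ("Welcome to our site. For more information, please contact "
        "Elm St. Pets, 10 Elm St, Elmtown, ES 00000, 555-010-0000.")
snippet = find_contact_text(page)
```

The returned snippet is still raw entity information; it would go through the same refinement step as billing data before being submitted to the matching application.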
The raw entity information is refined to produce refined entity information 506 that better conforms to a set of format standards, which may include postal standards as described above (step 4030). The refined entity information is submitted to a matching application 508 that attempts to identify, in an entity directory 510, an entity that matches the refined entity information ("matched entity") (step 4040). If a matched entity is found, an entry is added to a mapping database 512 to associate the top-level URL with the matched entity (step 4050).
In one or more of the procedures 1000, 2000, 3000, and 4000 described above, a filtering process may be executed before the entry is added to the mapping database. In the filtering process, a check for actual name and address similarity is performed. Such a check may be particularly desirable in a case in which the matching application is prone to mistakes in indicating matches. In particular, the filtering process checks to determine whether the names are very dissimilar, because in some cases a match may be indicated by the matching application even if such name dissimilarity exists. For example, the matching application may not always correctly identify an entity that has occupied part of a building that is listed as being fully occupied by another entity such as the building's owner. The filtering process may also include checking whether, in the results produced by the matching application, the zip code matches the county or metropolitan area indicated elsewhere in the results.
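A name-dissimilarity check of the kind used in the filtering process could be sketched as follows. The document does not specify a similarity measure; this example assumes the ratio from Python's standard `difflib.SequenceMatcher` and an arbitrary cut-off of 0.6, and the zip-versus-county check is not shown.

```python
from difflib import SequenceMatcher


def names_similar(submitted: str, returned: str, minimum: float = 0.6) -> bool:
    """Return False when the two names are very dissimilar (assumed cut-off)."""
    ratio = SequenceMatcher(None, submitted.upper(), returned.upper()).ratio()
    return ratio >= minimum


# A tenant and a building owner may share an address but not a name.
keep = names_similar("ELM STREET PETS INC", "ELM ST PETS INC")
reject = names_similar("ELM STREET PETS INC", "OAK TOWER HOLDINGS LLC")
```

A purported match that fails this check would be dropped before an entry is added to the mapping database.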
Procedure 3000 (or another one of the procedures described above) may also include a scoring process before the entry is added to the mapping database. In the scoring process, the contents of each field in the name and address in the matching entity information returned by the matching application are given a similarity score indicating the similarity of the contents to the corresponding contents of the refined entity information. An overall score such as an average is calculated based on the similarity scores for the fields, and the overall score is tested against a threshold to determine whether the matching entity information returned by the matching application is acceptable. In a specific implementation, a highest score is assigned to a perfectly matching telephone number, and lower scores are assigned for other types of matches such as zip code and address matches. In an example scoring scale of 0-100, an overall score of at least 90 may indicate a reliably correct match, an overall score that is less than 90 but is at least 80 may indicate that manual spot checking is necessary, an overall score that is less than 80 but is at least 70 may indicate that full manual checking is necessary, and an overall score that is less than 70 may indicate that the purported match is not worth considering.
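The scoring process and the example 0-100 scale can be sketched as below. The per-field similarity measure (`difflib.SequenceMatcher`), the plain average, and the label strings are assumptions; the field weighting that favors telephone matches in the specific implementation is not modeled.

```python
from difflib import SequenceMatcher


def field_score(a: str, b: str) -> float:
    """Similarity of two field values on a 0-100 scale (assumed measure)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def overall_score(refined: dict, matched: dict) -> float:
    """Average the per-field scores over the fields of the refined information."""
    scores = [field_score(value, matched.get(field, ""))
              for field, value in refined.items()]
    return sum(scores) / len(scores)


def classify(score: float) -> str:
    """Apply the example thresholds of 90, 80, and 70."""
    if score >= 90:
        return "reliable"
    if score >= 80:
        return "manual spot check"
    if score >= 70:
        return "full manual check"
    return "reject"


refined = {"name": "ELM STREET PETS INC", "street": "10 ELM ST", "zip": "00000"}
matched = {"name": "ELM ST PETS INC", "street": "10 ELM ST", "zip": "00000"}
decision = classify(overall_score(refined, matched))
```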
The scoring process may be used to compare the raw, refined, or matching entity information to existing entries in the mapping database, to determine whether an entry already exists for the entity, perhaps in connection with a different top-level URL. For example, the matching entity information returned by the matching application may be compared, by the scoring process, to entries in the mapping database, and the entry having the highest overall score that is above the threshold may be deemed an already-existing entry for the entity.
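Checking for an already-existing entry might be sketched as follows. The example compares one flattened string per entry rather than individual fields, strips spaces and punctuation before comparing, and uses a threshold of 90; each of these choices, along with the sample IDs, is an assumption for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Upper-case and remove everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def best_existing_entry(candidate: str, entries: dict, threshold: float = 90.0):
    """Return the ID of the highest-scoring entry at or above the threshold, or None."""
    best_id, best_score = None, 0.0
    for entry_id, text in entries.items():
        score = 100.0 * SequenceMatcher(
            None, normalize(candidate), normalize(text)).ratio()
        if score > best_score:
            best_id, best_score = entry_id, score
    return best_id if best_score >= threshold else None


# Hypothetical mapping-database entries keyed by unique ID.
entries = {
    "ID-0001": "Elm Street Pets Inc, 10 Elm St",
    "ID-0002": "Oak Tower Holdings LLC, 1 Oak Ave",
}
existing = best_existing_entry("ELM STREET PETS INC 10 ELM ST.", entries)
unknown = best_existing_entry("Maple Bakery, 5 Maple Rd", entries)
```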
The scoring process may include additional refining or reformatting of the information involved, including the removal of spaces and punctuation.
Five examples of the results of executions of procedure 2000 are shown in Fig. 10, and four examples of the results of executions of procedure 3000 are shown in Fig. 11.
The unique IDs provided in the mapping database may be used to search an information source outside the entity information database to produce a subset of the mapping database that has records only for entities having a particular characteristic, such as a particular geographic location or between 1000 and 5000 employees.
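Producing such a subset amounts to joining the mapping database to the outside source on the unique ID. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table names, column names, domain names other than "elmstpets.com", and employee counts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mapping (url TEXT, entity_id TEXT);
-- Stand-in for an information source outside the entity directory.
CREATE TABLE firmographics (entity_id TEXT, employees INTEGER);
INSERT INTO mapping VALUES
    ('elmstpets.com', 'ID-0001'),
    ('oaktower.example', 'ID-0002');
INSERT INTO firmographics VALUES
    ('ID-0001', 12),
    ('ID-0002', 2500);
""")

# Keep only mapping records for entities with 1000 to 5000 employees.
rows = conn.execute("""
    SELECT m.url, m.entity_id
    FROM mapping AS m
    JOIN firmographics AS f ON f.entity_id = m.entity_id
    WHERE f.employees BETWEEN 1000 AND 5000
""").fetchall()
conn.close()
```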
Where an entity constitutes a portion of another entity, each of the entities may be assigned different unique IDs, and the different unique IDs may be linked in the mapping database to note the relationship among the entities. For example, a company that has offices in different locations may be assigned a unique ID for the company itself and a respective different unique ID for each location. In another example, when two previously unrelated companies merge or one is acquired by the other, each may retain its unique ID and a new, different unique ID may be assigned to the combination of the two companies, or both companies may be assigned the same unique ID.
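One simple way to link different unique IDs is a child-to-parent table, sketched below. The ID values and function names are hypothetical, and this shows only the option in which each entity keeps its own ID and a new ID is assigned to the combination.

```python
from typing import Dict

# Child unique ID -> parent unique ID (hypothetical representation of the links).
parent_of: Dict[str, str] = {}


def link(child_id: str, parent_id: str) -> None:
    """Record that child_id constitutes a portion of parent_id."""
    parent_of[child_id] = parent_id


def top_level_id(entity_id: str) -> str:
    """Follow the links upward; the 'seen' set guards against accidental cycles."""
    seen = set()
    while entity_id in parent_of and entity_id not in seen:
        seen.add(entity_id)
        entity_id = parent_of[entity_id]
    return entity_id


link("ID-0001-A", "ID-0001")   # office location A of a company
link("ID-0001-B", "ID-0001")   # office location B of the same company
link("ID-0001", "ID-9000")     # company later merged into a combined entity
```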
The raw entity information may be acquired in other ways such as from the entity itself. The entity may submit raw entity information in an on-line questionnaire.
The mapping database and applications based on the mapping database may take advantage of a hierarchical organization of Web pages, by treating similarly a top-level URL page and all pages below the top-level URL page, such as pages sharing a particular prefix with the top-level URL page. For example, all pages sharing the prefix "http://www.isp321.com" may be treated as being under the control of an ISP named Global ISP Co.
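Treating pages below a top-level URL alike can be sketched as a prefix lookup. Preferring the longest matching prefix, so that a subscriber's pages are not attributed to the hosting ISP, is an assumption of this example rather than something stated above; the second prefix and its entity are also hypothetical.

```python
from typing import Dict, Optional


def controlling_entity(url: str, prefix_map: Dict[str, str]) -> Optional[str]:
    """Return the entity mapped to the longest prefix of url, or None."""
    best_prefix = None
    for prefix in prefix_map:
        if url.startswith(prefix):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix = prefix
    return prefix_map[best_prefix] if best_prefix is not None else None


prefix_map = {
    "http://www.isp321.com": "Global ISP Co.",
    "http://www.isp321.com/~elmstcats": "Elm St. Pets",  # hypothetical subscriber
}
isp_page = controlling_entity("http://www.isp321.com/about.html", prefix_map)
subscriber_page = controlling_entity(
    "http://www.isp321.com/~elmstcats/index.html", prefix_map)
```

A plain string-prefix test also matches unrelated hosts that merely begin with the same characters, so a fuller version would compare the host and path components separately.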
The mapping database may map an entity to Web pages maintained at different Web sites. In such a case, the entry in the mapping database associates the entity with all the sites.
One or more of the databases and directories referenced above may be or include a relational database and may have records to which fields may be added readily.
Any of many different types of computer equipment may be used. For example, one or more Intel-based personal computers may be used that run an SQL database on Linux and one or more programs written in Perl or the C programming language with interfaces to the SQL database.
The technique (i.e., the procedures described above) may be implemented in hardware or software, or a combination of both. In at least some cases, it is advantageous if the technique is implemented in computer programs executing on one or more programmable computers, such as a personal computer running or able to run an operating system such as Unix, Linux, Microsoft Windows 95, 98, or NT, or Macintosh OS, that each include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device such as a keyboard, and at least one output device. Program code is applied to data entered using the input device to perform the technique described above and to generate output information. The output information is applied to one or more output devices such as a display screen of the computer. In at least some cases, it is advantageous if each program is implemented in a high level procedural or object-oriented programming language such as Perl, C, C++, or Java to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. In at least some cases, it is advantageous if each such computer program is stored on a storage medium or device, such as ROM or optical or magnetic disc, that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described in this document. The system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner.
Other embodiments are within the scope of the following claims. For example, the user may be a human being or a non-human entity such as a computer program or an automated device that may interact with one or more of the databases or one or more of the applications via an application programming interface ("API") or a network message. An on-line information store or multiple databases may serve as the entity directory, which may take the form of any mechanism that provides automated access to information, such as a spreadsheet file or a store of email messages.

Claims

What is claimed is:
1. A method comprising: acquiring a first set of information that identifies an entity that is indicated as having control over the use of at least a portion of a World-Wide Web address; determining from the first set of information that a second set of information that identifies the entity is included in an entity directory; and recording that the use of at least a portion of the World-Wide Web address is under the control of the entity as identified in the entity directory.
2. The method of claim 1, further comprising: deriving first address information from the first set of information; deriving second address information from the second set of information; and comparing the first address information to the second address information.
3. A system comprising: an acquirer that acquires a first set of information that identifies an entity that is indicated as having control over the use of at least a portion of a World-Wide Web address; a determiner that determines from the first set of information that a second set of information that identifies the entity is included in an entity directory; and a recorder that records that the use of at least a portion of the World-Wide Web address is under the control of the entity as identified in the entity directory.
4. Computer software, residing on a computer-readable storage medium, comprising a set of instructions for use in a computer system to cause the computer system to: acquire a first set of information that identifies an entity that is indicated as having control over the use of at least a portion of a World-Wide Web address; determine from the first set of information that a second set of information that identifies the entity is included in an entity directory; and record that the use of at least a portion of the World-Wide Web address is under the control of the entity as identified in the entity directory.
5. A data processing system comprising: a computer; a storage device for storing data on a storage medium; a first logic system configured to acquire a first set of information that identifies an entity that is indicated as having control over the use of at least a portion of a World-Wide Web address; a second logic system configured to determine from the first set of information that a second set of information that identifies the entity is included in an entity directory; and a third logic system configured to record that the use of at least a portion of the World-Wide Web address is under the control of the entity as identified in the entity directory.
PCT/US1999/018644 1998-08-17 1999-08-16 Mapping information sources WO2000010106A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU55659/99A AU5565999A (en) 1998-08-17 1999-08-16 Mapping information sources

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US9702998P 1998-08-17 1998-08-17
US60/097,029 1998-08-17

Publications (1)

Publication Number Publication Date
WO2000010106A1 (en) 2000-02-24

Family

ID=22260433

Family Applications (4)

Application Number Title Priority Date Filing Date
PCT/US1999/018646 WO2000010108A1 (en) 1998-08-17 1999-08-16 Dynamically categorizing entity information
PCT/US1999/018643 WO2000010105A1 (en) 1998-08-17 1999-08-16 Enhancing computer-based searching
PCT/US1999/018644 WO2000010106A1 (en) 1998-08-17 1999-08-16 Mapping information sources
PCT/US1999/018645 WO2000010107A1 (en) 1998-08-17 1999-08-16 Analyzing internet-based information

Family Applications Before (2)

Application Number Title Priority Date Filing Date
PCT/US1999/018646 WO2000010108A1 (en) 1998-08-17 1999-08-16 Dynamically categorizing entity information
PCT/US1999/018643 WO2000010105A1 (en) 1998-08-17 1999-08-16 Enhancing computer-based searching

Family Applications After (1)

Application Number Title Priority Date Filing Date
PCT/US1999/018645 WO2000010107A1 (en) 1998-08-17 1999-08-16 Analyzing internet-based information

Country Status (5)

Country Link
US (2) US6654813B1 (en)
EP (1) EP1105818A1 (en)
JP (2) JP2002522847A (en)
AU (4) AU5566199A (en)
WO (4) WO2000010108A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6735585B1 (en) 1998-08-17 2004-05-11 Altavista Company Method for search engine generating supplemented search not included in conventional search result identifying entity data related to portion of located web page

Families Citing this family (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6654813B1 (en) * 1998-08-17 2003-11-25 Alta Vista Company Dynamically categorizing entity information
JP4460693B2 (en) * 1999-10-26 2010-05-12 富士通株式会社 Network system with information retrieval function
US8271316B2 (en) 1999-12-17 2012-09-18 Buzzmetrics Ltd Consumer to business data capturing system
CA2298194A1 (en) * 2000-02-07 2001-08-07 Profilium Inc. Method and system for delivering and targeting advertisements over wireless networks
WO2001069445A2 (en) * 2000-03-14 2001-09-20 Sony Electronics, Inc. A method and device for forming a semantic description
US6968380B1 (en) 2000-05-30 2005-11-22 International Business Machines Corporation Method and system for increasing ease-of-use and bandwidth utilization in wireless devices
US6985933B1 (en) * 2000-05-30 2006-01-10 International Business Machines Corporation Method and system for increasing ease-of-use and bandwidth utilization in wireless devices
US7747713B1 (en) * 2000-06-30 2010-06-29 Hitwise Pty. Ltd. Method and system for classifying information available on a computer network
US6983379B1 (en) 2000-06-30 2006-01-03 Hitwise Pty. Ltd. Method and system for monitoring online behavior at a remote site and creating online behavior profiles
WO2002003219A1 (en) * 2000-06-30 2002-01-10 Plurimus Corporation Method and system for monitoring online computer network behavior and creating online behavior profiles
WO2002008940A2 (en) * 2000-07-20 2002-01-31 Johnson Rodney D Information archival and retrieval system for internetworked computers
NL1016379C2 (en) * 2000-07-25 2002-01-28 Alphonsus Albertus Schirris Information searching method for e.g. internet, uses synonyms or translations of inputted search terms
AUPQ920300A0 (en) * 2000-08-04 2000-08-31 Sharinga Networks Inc. Network address resolution
JP2002084561A (en) * 2000-09-06 2002-03-22 Nec Corp Connection system, connection method therefor, and recording medium in which connection program is recorded
JP4200645B2 (en) * 2000-09-08 2008-12-24 日本電気株式会社 Information processing apparatus, information processing method, and recording medium
US7185065B1 (en) 2000-10-11 2007-02-27 Buzzmetrics Ltd System and method for scoring electronic messages
US7197470B1 (en) 2000-10-11 2007-03-27 Buzzmetrics, Ltd. System and method for collection analysis of electronic discussion methods
US7080101B1 (en) * 2000-12-01 2006-07-18 Ncr Corp. Method and apparatus for partitioning data for storage in a database
US20030061232A1 (en) * 2001-09-21 2003-03-27 Dun & Bradstreet Inc. Method and system for processing business data
US6763362B2 (en) * 2001-11-30 2004-07-13 Micron Technology, Inc. Method and system for updating a search engine
US7254573B2 (en) * 2002-10-02 2007-08-07 Burke Thomas R System and method for identifying alternate contact information in a database related to entity, query by identifying contact information of a different type than was in query which is related to the same entity
US7792828B2 (en) 2003-06-25 2010-09-07 Jericho Systems Corporation Method and system for selecting content items to be presented to a viewer
US7756750B2 (en) 2003-09-02 2010-07-13 Vinimaya, Inc. Method and system for providing online procurement between a buyer and suppliers over a network
NO20035563D0 (en) * 2003-10-01 2003-12-12 Telenor Asa Method and system for obtaining improved subscriber information
US7725414B2 (en) 2004-03-16 2010-05-25 Buzzmetrics, Ltd An Israel Corporation Method for developing a classifier for classifying communications
US7536382B2 (en) * 2004-03-31 2009-05-19 Google Inc. Query rewriting with entity detection
US20060015401A1 (en) * 2004-07-15 2006-01-19 Chu Barry H Efficiently spaced and used advertising in network-served multimedia documents
US7523085B2 (en) 2004-09-30 2009-04-21 Buzzmetrics, Ltd An Israel Corporation Topical sentiments in electronically stored communications
US9158855B2 (en) 2005-06-16 2015-10-13 Buzzmetrics, Ltd Extracting structured data from weblogs
US20070100960A1 (en) * 2005-10-28 2007-05-03 Yahoo! Inc. Managing content for RSS alerts over a network
US20070100836A1 (en) * 2005-10-28 2007-05-03 Yahoo! Inc. User interface for providing third party content as an RSS feed
US20090094137A1 (en) * 2005-12-22 2009-04-09 Toppenberg Larry W Web Page Optimization Systems
US7624101B2 (en) 2006-01-31 2009-11-24 Google Inc. Enhanced search results
US7660783B2 (en) 2006-09-27 2010-02-09 Buzzmetrics, Inc. System and method of ad-hoc analysis of data
US20080313142A1 (en) * 2007-06-14 2008-12-18 Microsoft Corporation Categorization of queries
US9392074B2 (en) 2007-07-07 2016-07-12 Qualcomm Incorporated User profile generation architecture for mobile content-message targeting
US9497286B2 (en) 2007-07-07 2016-11-15 Qualcomm Incorporated Method and system for providing targeted information based on a user profile in a mobile environment
US9203911B2 (en) 2007-11-14 2015-12-01 Qualcomm Incorporated Method and system for using a cache miss state match indicator to determine user suitability of targeted content messages in a mobile environment
US9391789B2 (en) 2007-12-14 2016-07-12 Qualcomm Incorporated Method and system for multi-level distribution information cache management in a mobile environment
US8347326B2 (en) 2007-12-18 2013-01-01 The Nielsen Company (US) Identifying key media events and modeling causal relationships between key events and reported feelings
KR100930617B1 (en) 2008-04-08 2009-12-09 한국과학기술정보연구원 Multiple object-oriented integrated search system and method
US20090327223A1 (en) * 2008-06-26 2009-12-31 Microsoft Corporation Query-driven web portals
US9607324B1 (en) 2009-01-23 2017-03-28 Zakta, LLC Topical trust network
US10007729B1 (en) 2009-01-23 2018-06-26 Zakta, LLC Collaboratively finding, organizing and/or accessing information
US10191982B1 (en) * 2009-01-23 2019-01-29 Zakata, LLC Topical search portal
US8874727B2 (en) 2010-05-31 2014-10-28 The Nielsen Company (Us), Llc Methods, apparatus, and articles of manufacture to rank users in an online social network
US8484186B1 (en) 2010-11-12 2013-07-09 Consumerinfo.Com, Inc. Personalized people finder
US10068266B2 (en) 2010-12-02 2018-09-04 Vinimaya Inc. Methods and systems to maintain, check, report, and audit contract and historical pricing in electronic procurement
US10015125B2 (en) * 2014-06-20 2018-07-03 Zinc, Inc. Directory generation and messaging
US10643178B1 (en) 2017-06-16 2020-05-05 Coupa Software Incorporated Asynchronous real-time procurement system
CA3145535A1 (en) * 2021-01-12 2022-07-12 Tealbook Inc. System and method for data profiling

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997015018A1 (en) * 1995-10-16 1997-04-24 Bell Communications Research, Inc. Method and system for providing uniform access to heterogeneous information
WO1997029414A2 (en) * 1996-02-09 1997-08-14 At & T Corp. Method and apparatus for passively browsing the internet

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5974455A (en) * 1995-12-13 1999-10-26 Digital Equipment Corporation System for adding new entry to web page table upon receiving web page including link to another web page not having corresponding entry in web page table
JPH09311873A (en) * 1996-01-11 1997-12-02 Sony Corp Information providing data structure, information providing method, and information receiving terminal
US5905862A (en) * 1996-09-04 1999-05-18 Intel Corporation Automatic web site registration with multiple search engines
US5933827A (en) * 1996-09-25 1999-08-03 International Business Machines Corporation System for identifying new web pages of interest to a user
US6195657B1 (en) * 1996-09-26 2001-02-27 Imana, Inc. Software, method and apparatus for efficient categorization and recommendation of subjects according to multidimensional semantics
US5958008A (en) * 1996-10-15 1999-09-28 Mercury Interactive Corporation Software system and associated methods for scanning and mapping dynamically-generated web documents
US5974572A (en) * 1996-10-15 1999-10-26 Mercury Interactive Corporation Software system and methods for generating a load test using a server access log
EA199900411A1 (en) * 1996-10-25 2000-02-28 Айпиэф, Инк. SYSTEM AND METHOD OF SERVICE AND DISTRIBUTION THROUGH THE INTERNET INFORMATION RELATING TO CONSUMER GOODS
US6085229A (en) * 1998-05-14 2000-07-04 Belarc, Inc. System and method for providing client side personalization of content of web pages and the like
US6141759A (en) * 1997-12-10 2000-10-31 Bmc Software, Inc. System and architecture for distributing, monitoring, and managing information requests on a computer network
US6151624A (en) * 1998-02-03 2000-11-21 Realnames Corporation Navigating network resources based on metadata
US6401118B1 (en) * 1998-06-30 2002-06-04 Online Monitoring Services Method and computer program product for an online monitoring search engine
US6735585B1 (en) * 1998-08-17 2004-05-11 Altavista Company Method for search engine generating supplemented search not included in conventional search result identifying entity data related to portion of located web page
US6654813B1 (en) * 1998-08-17 2003-11-25 Alta Vista Company Dynamically categorizing entity information

Also Published As

Publication number Publication date
EP1105818A1 (en) 2001-06-13
AU5565999A (en) 2000-03-06
WO2000010107A1 (en) 2000-02-24
AU5565899A (en) 2000-03-06
JP5171927B2 (en) 2013-03-27
WO2000010105A1 (en) 2000-02-24
JP2002522847A (en) 2002-07-23
AU5566099A (en) 2000-03-06
JP2011100461A (en) 2011-05-19
US6654813B1 (en) 2003-11-25
US20040267727A1 (en) 2004-12-30
WO2000010108A1 (en) 2000-02-24
US7398266B2 (en) 2008-07-08
AU5566199A (en) 2000-03-06

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH HU IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase