WO2002021309A3 - Item name normalization - Google Patents

Item name normalization

Info

Publication number
WO2002021309A3
Authority
WO
WIPO (PCT)
Prior art keywords
item name
item
name
normalized
variant
Prior art date
Application number
PCT/US2001/026629
Other languages
French (fr)
Other versions
WO2002021309A2 (en)
Inventor
Arkady Borkovsky
Original Assignee
Centives Inc E
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2000-09-01
Filing date
2001-08-23
Publication date
2003-10-02
Application filed by Centives Inc E
Priority to AU2001285296A (AU2001285296A1, en)
Priority to EP01964445A (EP1368745A2, en)
Publication of WO2002021309A2 (en)
Publication of WO2002021309A3 (en)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/338 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9038 Presentation of query results
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99935 Query augmenting and refining, e.g. inexact access
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99936 Pattern matching access
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99944 Object-oriented database structure

Abstract

A computer-implemented approach for processing search queries generally involves normalizing names and descriptions of items. Each of the various forms of a name or description of an item is referred to as an item name variant. The normalized form of the name or description of an item is referred to as a normalized item name. Similar item name variants are grouped together to form clusters, and each cluster of item name variants is mapped to a normalized item name. A dictionary of normalized item names is created by storing: 1) the item name variant, 2) the information that is obtained from the item source and is associated with the item name variant, and 3) the mapping information that maps the item name variant to the corresponding normalized item name.
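
The flow described in the abstract (group similar variants into clusters, map each cluster to a normalized name, store three pieces of information per variant) can be illustrated with a minimal Python sketch. The abstract does not say how similarity is measured or how the normalized name is chosen, so everything specific here is an assumption made for illustration: the `cluster_key` similarity rule (case-folded, punctuation-free, order-independent tokens), the choice of the first variant seen as the normalized name, and the names `build_dictionary` and `normalize_query`.

```python
import re
from collections import defaultdict


def cluster_key(variant):
    """Hypothetical similarity rule: variants are 'similar' when they share
    the same set of lowercase alphanumeric tokens, regardless of order."""
    tokens = re.findall(r"[a-z0-9]+", variant.lower())
    return " ".join(sorted(tokens))


def build_dictionary(records):
    """Build dictionary entries from (item_name_variant, source_info) pairs.

    Each entry stores the three things the abstract lists: the variant,
    the information obtained from the item source, and the mapping to the
    normalized item name.
    """
    # Group similar variants into clusters.
    clusters = defaultdict(list)
    for variant, source_info in records:
        clusters[cluster_key(variant)].append((variant, source_info))

    dictionary = []
    for members in clusters.values():
        # Assumption: the first variant seen stands in as the normalized name.
        normalized_name = members[0][0]
        for variant, source_info in members:
            dictionary.append({
                "variant": variant,
                "source_info": source_info,
                "normalized_name": normalized_name,
            })
    return dictionary


def normalize_query(query, dictionary):
    """Map a search query to a normalized item name, or None if unknown."""
    key = cluster_key(query)
    for entry in dictionary:
        if cluster_key(entry["variant"]) == key:
            return entry["normalized_name"]
    return None


records = [
    ("Sony WM-10 Walkman", {"source": "store-a"}),
    ("sony walkman wm 10", {"source": "store-b"}),
    ("Canon PowerShot S100", {"source": "store-a"}),
]
dictionary = build_dictionary(records)
```

With these sample records, the two Sony variants fall into one cluster and share a normalized name, so a query that is worded differently but uses the same tokens resolves to that name. A real system would likely need a looser similarity measure than exact token-set equality.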

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU2001285296A AU2001285296A1 (en) 2000-09-01 2001-08-23 Item name normalization
EP01964445A EP1368745A2 (en) 2000-09-01 2001-08-23 Item name normalization

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/653,040 US6556991B1 (en) 2000-09-01 2000-09-01 Item name normalization
US09/653,040 2000-09-01

Publications (2)

Publication Number Publication Date
WO2002021309A2 WO2002021309A2 (en) 2002-03-14
WO2002021309A3 WO2002021309A3 (en) 2003-10-02

Family

ID=24619257

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/026629 WO2002021309A2 (en) 2000-09-01 2001-08-23 Item name normalization

Country Status (4)

Country Link
US (5) US6556991B1 (en)
EP (2) EP1758035A1 (en)
AU (1) AU2001285296A1 (en)
WO (1) WO2002021309A2 (en)

Families Citing this family (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7958224B2 (en) 1999-02-17 2011-06-07 Catalina Marketing Corporation Incentive network for distributing incentives on a network-wide basis and for permitting user access to individual incentives from a plurality of network servers
US6556991B1 (en) * 2000-09-01 2003-04-29 E-Centives, Inc. Item name normalization
US20020188556A1 (en) * 2001-05-02 2002-12-12 James Colica System and method for monitoring and analyzing exposure data
US7707050B2 (en) 2004-03-11 2010-04-27 Risk Management Solutions, Inc. Systems and methods for determining concentrations of exposure
US8229766B2 (en) 2004-07-30 2012-07-24 Risk Management Solutions, Inc. System and method for producing a flexible geographical grid
US20080162480A1 (en) * 2004-06-14 2008-07-03 Symphonyrpm, Inc. Decision object for associating a plurality of business plans
US8244689B2 (en) * 2006-02-17 2012-08-14 Google Inc. Attribute entropy as a signal in object normalization
US7769579B2 (en) * 2005-05-31 2010-08-03 Google Inc. Learning facts from semi-structured text
US7890503B2 (en) * 2005-02-07 2011-02-15 Microsoft Corporation Method and system for performing secondary search actions based on primary search result attributes
US8682913B1 (en) 2005-03-31 2014-03-25 Google Inc. Corroborating facts extracted from multiple sources
US7587387B2 (en) 2005-03-31 2009-09-08 Google Inc. User interface for facts query engine with snippets from information sources that include query terms and answer terms
US9208229B2 (en) * 2005-03-31 2015-12-08 Google Inc. Anchor text summarization for corroboration
US7447647B1 (en) * 2005-05-02 2008-11-04 Shedlack David G Techniques and definition logic embodied in a computer program product stored and performed on a computerized device for providing a singular graphical user interface configured to enable a user to create/manage/transact/report and view all full granular reference product data in a configurable transactable aggregate form
US8996470B1 (en) 2005-05-31 2015-03-31 Google Inc. System for ensuring the internal consistency of a fact repository
US7627515B2 (en) * 2005-06-28 2009-12-01 Microsoft Corporation Price determination for items of low demand
US20070112814A1 (en) * 2005-11-12 2007-05-17 Cheshire Stuart D Methods and systems for providing improved security when using a uniform resource locator (URL) or other address or identifier
US7991797B2 (en) 2006-02-17 2011-08-02 Google Inc. ID persistence through normalization
US8260785B2 (en) 2006-02-17 2012-09-04 Google Inc. Automatic object reference identification and linking in a browseable fact repository
US8700568B2 (en) 2006-02-17 2014-04-15 Google Inc. Entity normalization via name normalization
US7702631B1 (en) 2006-03-14 2010-04-20 Google Inc. Method and system to produce and train composite similarity functions for product normalization
US8122026B1 (en) 2006-10-20 2012-02-21 Google Inc. Finding and disambiguating references to entities on web pages
US20080103887A1 (en) * 2006-10-31 2008-05-01 Google Inc. Selecting advertisements based on consumer transactions
US8347202B1 (en) 2007-03-14 2013-01-01 Google Inc. Determining geographic locations for place names in a fact repository
US8239350B1 (en) 2007-05-08 2012-08-07 Google Inc. Date ambiguity resolution
US7966291B1 (en) 2007-06-26 2011-06-21 Google Inc. Fact-based object merging
US7970766B1 (en) 2007-07-23 2011-06-28 Google Inc. Entity type assignment
US8738643B1 (en) 2007-08-02 2014-05-27 Google Inc. Learning synonymous object names from anchor texts
CA2602309C (en) 2007-09-13 2015-10-13 Semiconductor Insights, Inc. A method of bibliographic field normalization
US8812435B1 (en) 2007-11-16 2014-08-19 Google Inc. Learning objects and facts from documents
US9135396B1 (en) * 2008-12-22 2015-09-15 Amazon Technologies, Inc. Method and system for determining sets of variant items
US9443209B2 (en) * 2009-04-30 2016-09-13 Paypal, Inc. Recommendations based on branding
US20120130828A1 (en) * 2010-11-23 2012-05-24 Cooley Robert W Source of decision considerations for managing advertising pricing
US11507548B1 (en) * 2011-09-21 2022-11-22 Amazon Technologies, Inc. System and method for generating a classification model with a cost function having different penalties for false positives and false negatives
US8793201B1 (en) * 2011-10-27 2014-07-29 Amazon Technologies, Inc. System and method for seeding rule-based machine learning models
US10467322B1 (en) * 2012-03-28 2019-11-05 Amazon Technologies, Inc. System and method for highly scalable data clustering
US20130297634A1 (en) * 2012-05-07 2013-11-07 Sap Ag Entity Name Variant Generator
CN103578015A (en) * 2012-08-07 2014-02-12 阿里巴巴集团控股有限公司 Method and device for achieving commodity attribute navigation
US10140621B2 (en) * 2012-09-20 2018-11-27 Ebay Inc. Determining and using brand information in electronic commerce
US10210553B2 (en) * 2012-10-15 2019-02-19 Cbs Interactive Inc. System and method for managing product catalogs
US9426053B2 (en) 2012-12-06 2016-08-23 International Business Machines Corporation Aliasing of named data objects and named graphs for named data networks
US9251292B2 (en) 2013-03-11 2016-02-02 Wal-Mart Stores, Inc. Search result ranking using query clustering
US8831969B1 (en) * 2013-10-02 2014-09-09 Linkedin Corporation System and method for determining users working for the same employers in a social network
US10318702B2 (en) 2016-01-19 2019-06-11 Ford Motor Company Multi-valued decision diagram reversible restriction
EP3928467A1 (en) 2019-02-22 2021-12-29 Telefonaktiebolaget LM Ericsson (publ) Managing telecommunication network event data
US10726176B1 (en) 2019-05-15 2020-07-28 Bqr Reliability Engineering Ltd. Method and apparatus for designing electrical and electronic circuits
US20230055163A1 (en) * 2021-08-19 2023-02-23 Maplebear Inc. (Dba Instacart) Verifying matches between identifiers stored in a digital catalog

Citations (3)

Publication number Priority date Publication date Assignee Title
US5781772A (en) * 1989-07-12 1998-07-14 Digital Equipment Corporation Compressed prefix matching database searching
US5960430A (en) * 1996-08-23 1999-09-28 General Electric Company Generating rules for matching new customer records to existing customer records in a large database
JP2000172722A (en) * 1998-12-01 2000-06-23 Korea Electronics Telecommun Method and system for product information automatic indexing of on-line store

Family Cites Families (14)

Publication number Priority date Publication date Assignee Title
AU631276B2 (en) * 1989-12-22 1992-11-19 Bull Hn Information Systems Inc. Name resolution in a directory database
JP3175399B2 (en) * 1993-05-18 2001-06-11 セイコーエプソン株式会社 Card data management device
JP3652376B2 (en) * 1995-06-07 2005-05-25 インターナショナル・ビジネス・マシーンズ・コーポレーション Methodology for creating object structures for accessing traditional non-object oriented business applications
US5832480A (en) * 1996-07-12 1998-11-03 International Business Machines Corporation Using canonical forms to develop a dictionary of names in a text
US6243709B1 (en) * 1998-06-29 2001-06-05 Sun Microsystems, Inc. Method and apparatus for loading stored procedures in a database corresponding to object-oriented data dependencies
US6115690A (en) * 1997-12-22 2000-09-05 Wong; Charles Integrated business-to-business Web commerce and business automation system
US6334131B2 (en) * 1998-08-29 2001-12-25 International Business Machines Corporation Method for cataloging, filtering, and relevance ranking frame-based hierarchical information structures
US6546381B1 (en) * 1998-11-02 2003-04-08 International Business Machines Corporation Query optimization system and method
US6308149B1 (en) * 1998-12-16 2001-10-23 Xerox Corporation Grouping words with equivalent substrings by automatic clustering based on suffix relationships
US20020004749A1 (en) * 2000-02-09 2002-01-10 Froseth Barrie R. Customized food selection, ordering and distribution system and method
US6556991B1 (en) 2000-09-01 2003-04-29 E-Centives, Inc. Item name normalization
US7657540B1 (en) * 2003-02-04 2010-02-02 Seisint, Inc. Method and system for linking and delinking data records
US7146361B2 (en) * 2003-05-30 2006-12-05 International Business Machines Corporation System, method and computer program product for performing unstructured information management and automatic text analysis, including a search operator functioning as a Weighted AND (WAND)
US7881505B2 (en) * 2006-09-29 2011-02-01 Pittsburgh Pattern Recognition, Inc. Video retrieval system for human face content

Non-Patent Citations (2)

Title
CARRICK C ET AL: "Automatic association of news items", INFORMATION PROCESSING & MANAGEMENT, ELSEVIER, BARKING, GB, vol. 33, no. 5, 1 September 1997 (1997-09-01), pages 615 - 632, XP004089624, ISSN: 0306-4573 *
PATENT ABSTRACTS OF JAPAN vol. 2000, no. 09 13 October 2000 (2000-10-13) *

Also Published As

Publication number Publication date
US20090222425A1 (en) 2009-09-03
US8762361B2 (en) 2014-06-24
US6556991B1 (en) 2003-04-29
US7542964B2 (en) 2009-06-02
EP1758035A1 (en) 2007-02-28
US6853996B1 (en) 2005-02-08
US8037019B2 (en) 2011-10-11
EP1368745A2 (en) 2003-12-10
WO2002021309A2 (en) 2002-03-14
US20050187951A1 (en) 2005-08-25
AU2001285296A1 (en) 2002-03-22
US20120030205A1 (en) 2012-02-02

Similar Documents

Publication Publication Date Title
WO2002021309A3 (en) Item name normalization
MXPA05009733A (en) System and method to acquire information from a database.
EP1624386A3 (en) Searching for data objects
WO2005008529A8 (en) Optimized sql code generation
WO2004036365A3 (en) Dividing a travel query into sub-queries
CA2327866A1 (en) Table searching technique
WO2002056196A3 (en) Creation of structured data from plain text
BR9911931A (en) Computer-implemented database, systems for storing and retrieving tuples and for storing a plurality of tuples, and processes for retrieving a record from a compacted database and for storing instances of a plurality of values
SG146633A1 (en) Methods and systems for improving a search ranking using related queries
WO2005109180A3 (en) Two-stage data validation and mapping for database access
WO2001093111A3 (en) Generating multidimensional output using meta-models and meta-outline
WO2001003002A3 (en) Meta-descriptor for multimedia information
ES2136253T3 (en) INTELLIGENT DATA STORE.
WO2002099701A3 (en) Consistent read in a distributed database environment
WO2005017682A3 (en) Product placement engine and method
EP1383056A3 (en) Querying an object-relational database system
BR0111192A (en) System and method for automatically generating database queries
WO2004068292A3 (en) Xml types in java
BRPI0600547A (en) mapping a file system model to a database object
WO2004095428A3 (en) Index and query processor for data and information retrieval, integration and sharing from multiple disparate data sources
EP1476826A4 (en) Similarity search engine for use with relational databases
WO2004038608A3 (en) Navigation tool for exploring a knowledge base
EP0955592A3 (en) A system and method for querying a music database
EP1600861A3 (en) Query to task mapping
EP1081611A3 (en) Query engine and method for Querying data using metadata model

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PH PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2001964445

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP Wipo information: published in national office

Ref document number: 2001964445

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: JP

DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)