CA2172874A1 - Method and System for Minimizing Attribute Naming Errors in Set Oriented Duplicate Detection - Google Patents

Method and System for Minimizing Attribute Naming Errors in Set Oriented Duplicate Detection

Info

Publication number
CA2172874A1
CA2172874A1 CA2172874A CA2172874A CA2172874A1 CA 2172874 A1 CA2172874 A1 CA 2172874A1 CA 2172874 A CA2172874 A CA 2172874A CA 2172874 A CA2172874 A CA 2172874A CA 2172874 A1 CA2172874 A1 CA 2172874A1
Authority
CA
Canada
Prior art keywords
list
record
records
duplicate
fields
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CA2172874A
Other languages
French (fr)
Other versions
CA2172874C (en)
Inventor
Robert J. Johnson
Shawn W. Szturma
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pitney Bowes Inc
Original Assignee
Pitney Bowes Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1995-03-30
Filing date
1996-03-28
Publication date
1996-10-01
Application filed by Pitney Bowes Inc
Publication of CA2172874A1
Application granted
Publication of CA2172874C
Anticipated expiration
Expired - Fee Related

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q99/00: Subject matter not provided for in other groups of this subclass

Abstract

The invention is a method for detecting duplicate records in a list or file. A list comprising one or more records is entered into a data processing system, and a nickname lookup table is applied to the records to determine a common first name. Once a common name has been determined, the method matches a first record from the list against at least one other record by comparing their fields according to a set of predetermined criteria. The matching sequence determines a duplicate set, which comprises at least two records whose fields match. The method then lists the matching records sequentially so that the system can create a new record by filling each empty field with the next available corresponding field from a subsequent record within the duplicate set. The newly created record is retained on the original list, and the duplicate records are placed on a second list.
The list can be pre-sorted just before the matching sequence and again just before the final list is output. Additionally, the system operator can be given a number of options to provide flexibility. These options can include: manually correcting a record on the duplicate records list; deleting an address record from the list of duplicates; or outputting the record.
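
The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the nickname table `NICKNAMES`, the field names in `FIELDS`, the choice of `MATCH_FIELDS`, and the use of exact, case-insensitive comparison as the "pre-determined criteria" are all assumptions made for illustration. The sketch normalizes first names through the lookup table, pre-sorts and groups records whose match fields agree, builds one merged record per duplicate set by filling each empty field from the next record that has a value, and returns the retained list and the list of duplicates.

```python
# Hypothetical nickname lookup table: nickname -> common first name.
NICKNAMES = {"bob": "robert", "rob": "robert", "bill": "william", "liz": "elizabeth"}

# Hypothetical record layout and match criteria (not taken from the patent).
FIELDS = ("first", "last", "street", "city", "phone")
MATCH_FIELDS = ("first", "last", "street")


def common_first_name(name):
    """Map a first name to its common form using the nickname table."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def match_key(record):
    """Build the comparison key: normalized values of the match fields."""
    parts = []
    for name in MATCH_FIELDS:
        value = record.get(name, "").strip().lower()
        if name == "first":
            value = common_first_name(value)
        parts.append(value)
    return tuple(parts)


def merge(duplicate_set):
    """Create one record, taking each field from the first record that has it."""
    merged = {}
    for name in FIELDS:
        merged[name] = next((r[name] for r in duplicate_set if r.get(name)), "")
    return merged


def dedupe(records):
    """Return (retained, duplicates) for a list of dict records."""
    groups = {}
    # Pre-sort so matching records sit next to each other; sorted() is stable,
    # so records with equal keys keep their original relative order.
    for record in sorted(records, key=match_key):
        groups.setdefault(match_key(record), []).append(record)

    retained, duplicates = [], []
    for group in groups.values():
        if len(group) == 1:
            retained.append(group[0])      # unique record stays as entered
        else:
            retained.append(merge(group))  # merged record goes on the main list
            duplicates.extend(group)       # originals go to the second list
    return retained, duplicates
```

In this sketch, a record for "Bob Smith" with a phone number but no city and a record for "Robert Smith" at the same street with a city but no phone number fall into one duplicate set, and the merged record carries both the phone number and the city. Keeping the first record's spelling of the first name in the merged record is a design choice of the sketch, not something the abstract specifies.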

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/413,579 1995-03-30
US08/413,579 US5799302A (en) 1995-03-30 1995-03-30 Method and system for minimizing attribute naming errors in set oriented duplicate detection

Publications (2)

Publication Number Publication Date
CA2172874A1 (en) 1996-10-01
CA2172874C (en) 1999-11-23

Family

ID=23637788

Family Applications (1)

Application Number   Priority Date   Filing Date   Title
CA002172874A         1995-03-30      1996-03-28    Method and system for minimizing attribute naming errors in set oriented duplicate detection
Status: Expired - Fee Related; published as CA2172874C (en)

Country Status (3)

Country Link
US (1) US5799302A (en)
CA (1) CA2172874C (en)
FR (1) FR2732489B1 (en)

Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5822720A (en) 1994-02-16 1998-10-13 Sentius Corporation System amd method for linking streams of multimedia data for reference material for display
US6026398A (en) * 1997-10-16 2000-02-15 Imarket, Incorporated System and methods for searching and matching databases
US7089331B1 (en) 1998-05-29 2006-08-08 Oracle International Corporation Method and mechanism for reducing client-side memory footprint of transmitted data
CN100492388C (en) 1998-08-14 2009-05-27 3M创新有限公司 Radio frequency identification system applications
US7496854B2 (en) * 1998-11-10 2009-02-24 Arendi Holding Limited Method, system and computer readable medium for addressing handling from a computer program
US7272604B1 (en) 1999-09-03 2007-09-18 Atle Hedloy Method, system and computer readable medium for addressing handling from an operating system
NO984066L (en) * 1998-09-03 2000-03-06 Arendi As Computer function button
US6662079B2 (en) * 1998-11-30 2003-12-09 Pitney Bowes Inc. Method and system for preparation of mailpieces having a capability for processing intermixed qualified and non-qualified mailpieces
US6529896B1 (en) 2000-02-17 2003-03-04 International Business Machines Corporation Method of optimizing a query having an existi subquery and a not-exists subquery
US7389284B1 (en) 2000-02-29 2008-06-17 Oracle International Corporation Method and mechanism for efficient processing of remote-mapped queries
US9070105B2 (en) * 2000-04-21 2015-06-30 United States Postal Service Systems and methods for providing change of address services over a network
US6738768B1 (en) * 2000-06-27 2004-05-18 Johnson William J System and method for efficient information capture
US7013455B1 (en) 2000-10-19 2006-03-14 International Business Machines Corporation System for automatically altering environment variable to run proper executable file by removing references to all except one duplicate file in the path sequence
CA2449702A1 (en) * 2001-06-05 2002-12-12 3M Innovative Properties Company Methods of managing the transfer and use of data
US20020180588A1 (en) * 2001-06-05 2002-12-05 Erickson David P. Radio frequency identification in document management
EP1399871A2 (en) * 2001-06-15 2004-03-24 3M Innovative Properties Company Methods of managing the transfer, use, and importation of data
US20020190862A1 (en) * 2001-06-15 2002-12-19 3M Innovative Properties Company Methods of managing the transfer, use, and importation of data
US7130861B2 (en) 2001-08-16 2006-10-31 Sentius International Corporation Automated creation and delivery of database content
US7103590B1 (en) 2001-08-24 2006-09-05 Oracle International Corporation Method and system for pipelined database table functions
US7092956B2 (en) * 2001-11-02 2006-08-15 General Electric Capital Corporation Deduplication system
US6999975B1 (en) * 2001-11-14 2006-02-14 Cas, Inc. System and method for identifying records with valid address, but invalid name information
US7370044B2 (en) * 2001-11-19 2008-05-06 Equifax, Inc. System and method for managing and updating information relating to economic entities
US7321858B2 (en) * 2001-11-30 2008-01-22 United Negro College Fund, Inc. Selection of individuals from a pool of candidates in a competition system
US7610351B1 (en) 2002-05-10 2009-10-27 Oracle International Corporation Method and mechanism for pipelined prefetching
US6973457B1 (en) 2002-05-10 2005-12-06 Oracle International Corporation Method and system for scrollable cursors
US6968339B1 (en) * 2002-08-20 2005-11-22 Bellsouth Intellectual Property Corporation System and method for selecting data to be corrected
US7103603B2 (en) * 2003-03-28 2006-09-05 International Business Machines Corporation Method, apparatus, and system for improved duplicate record processing in a sort utility
US8423563B2 (en) * 2003-10-16 2013-04-16 Sybase, Inc. System and methodology for name searches
US7295120B2 (en) * 2004-12-10 2007-11-13 3M Innovative Properties Company Device for verifying a location of a radio-frequency identification (RFID) tag on an item
US8204866B2 (en) * 2007-05-18 2012-06-19 Microsoft Corporation Leveraging constraints for deduplication
US20080301184A1 (en) * 2007-05-30 2008-12-04 Pitney Bowes Inc. System and Method for Updating Mailing Lists
US20090167502A1 (en) * 2007-12-31 2009-07-02 3M Innovative Properties Company Device for verifying a location and functionality of a radio-frequency identification (RFID) tag on an item
US8429632B1 (en) 2009-06-30 2013-04-23 Google Inc. Method and system for debugging merged functions within a program
US8458681B1 (en) * 2009-06-30 2013-06-04 Google Inc. Method and system for optimizing the object code of a program
US8682898B2 (en) 2010-04-30 2014-03-25 International Business Machines Corporation Systems and methods for discovering synonymous elements using context over multiple similar addresses
US8683455B1 (en) 2011-01-12 2014-03-25 Google Inc. Method and system for optimizing an executable program by selectively merging identical program entities
US8689200B1 (en) 2011-01-12 2014-04-01 Google Inc. Method and system for optimizing an executable program by generating special operations for identical program entities
US8996548B2 (en) 2011-01-19 2015-03-31 Inmar Analytics, Inc. Identifying consuming entity behavior across domains
US20130007106A1 (en) * 2011-07-01 2013-01-03 Salesforce. Com Inc. Asynchronous interaction in the report generator
US9563677B2 (en) 2012-12-11 2017-02-07 Melissa Data Corp. Systems and methods for clustered matching of records using geographic proximity
US11055327B2 (en) 2018-07-01 2021-07-06 Quadient Technologies France Unstructured data parsing for structured information

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5326181A (en) * 1986-10-14 1994-07-05 Bryce Office Systems Inc. Envelope addressing system adapted to simultaneously print addresses and bar codes
US4858907A (en) * 1986-10-14 1989-08-22 Bryce Office Systems, Inc. System for feeding envelopes for simultaneous printing of addresses and bar codes
US4853882A (en) * 1987-11-02 1989-08-01 A. C. Nielsen Company System and method for protecting against redundant mailings
US5079714A (en) * 1989-10-03 1992-01-07 Pitney Bowes Inc. Mail deliverability by mail and database processing
US5111395A (en) * 1989-11-03 1992-05-05 Smith Rodney A Automated fund collection system including means to eliminate duplicate entries from a mailing list
US5227970A (en) * 1990-07-06 1993-07-13 Bernard C. Harris Publishing Methods and systems for updating group mailing lists
AU9086891A (en) * 1990-11-05 1992-05-26 Johnson & Quin, Inc. Document control and audit apparatus and method
US5245533A (en) * 1990-12-18 1993-09-14 A. C. Nielsen Company Marketing research method and system for management of manufacturer's discount coupon offers
US5428777A (en) * 1991-11-18 1995-06-27 Taylor Publishing Company Automatic index for yearbooks with spell checking capabilities
US5377120A (en) * 1992-06-11 1994-12-27 Humes; Carl L. Apparatus for commingling & addressing mail pieces
US5680611A (en) * 1995-09-29 1997-10-21 Electronic Data Systems Corporation Duplicate record detection

Also Published As

Publication number Publication date
FR2732489B1 (en) 1999-03-19
US5799302A (en) 1998-08-25
FR2732489A1 (en) 1996-10-04
CA2172874C (en) 1999-11-23

Similar Documents

Publication Publication Date Title
CA2172874A1 (en) Method and System for Minimizing Attribute Naming Errors in Set Oriented Duplicate Detection
US5276868A (en) Method and apparatus for pointer compression in structured databases
Chernoff Representations, automorphisms, and derivations of some operator algebras
US7814231B2 (en) Method of synchronizing between three or more devices
KR950012220A (en) Address Translation Mechanism of Virtual Memory Computer Systems Supporting Multiple Page Sizes
US8688659B2 (en) Method for indexed-field based difference detection and correction
CA2049133A1 (en) Methods and apparatus for implementing data bases to provide object-oriented invocation of applications
CA2223933A1 (en) Method for managing globally distributed software components
WO1999023582B1 (en) Methods and apparatus for a universal tracking system
CN104009921B (en) The data message forwarding method matched based on arbitrary fields
GB1514829A (en) Transversal multi-tap equalizers
CN104463460B (en) Processing method and processing device for the waiting information that network data is launched
CN107451177A (en) For the querying method and system of the block chain of the single corrigenda of increase block
US6760737B2 (en) Spatial median filter
EP0400820A3 (en) Content addressable memory
EP0394172A3 (en) Method of performing file services given partial file names
JPH1040255A (en) Hash table control device
CN105608139B (en) Data matching system and method
JPH0315982A (en) Logical simulation system
DE502004006212D1 (en) METHOD FOR PROCESSING CDR INFORMATION
EP3683698A1 (en) Data storage method and system for datasets
JP2560292B2 (en) Vehicle work alignment device
CN106570024A (en) Data increment processing method and apparatus
JPH06290100A (en) Data base system
JPH0221026B2 (en)

Legal Events

Date Code Title Description
EEER Examination request
MKLA Lapsed
MKLA Lapsed Effective date: 20120328