A system for recognition of characters on a medium. The system includes a scanner for scanning a medium such as a page of printed text and graphics and producing a bit-mapped representation of the page. The bit-mapped representation of the page is then stored in a memory means such as the memory of a computer system. A processor processes the bit-mapped image to produce an output comprising coded character representations of the text on the page. The present invention discloses parsing a page to allow for production of the output characters in a logical sequence, a combination of feature detection methods and template matching methods for recognition of characters and a number of methods for feature detection such as use of statistical data and polygon fitting.
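
The template matching mentioned in the abstract can be pictured with a minimal sketch. It assumes each character is a small grid of 0/1 pixels and scores a template by the fraction of pixels that agree; the function names and the scoring rule are illustrative assumptions, not the method the patent describes.

```python
def match_score(bitmap, template):
    """Fraction of pixels that agree between two equal-sized 0/1 grids."""
    total = agree = 0
    for row_b, row_t in zip(bitmap, template):
        for b, t in zip(row_b, row_t):
            total += 1
            agree += (b == t)
    return agree / total if total else 0.0


def best_template(bitmap, templates):
    """Return (label, score) for the closest template.

    `templates` is a hypothetical mapping of character label -> 0/1 grid.
    """
    scored = ((label, match_score(bitmap, grid)) for label, grid in templates.items())
    return max(scored, key=lambda pair: pair[1])
```

A real recognizer would presumably normalize size and position before comparing, and would fall back on feature detection when no template scores well.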

Inventors: Philip Bernzott, John Dilworth, David George, Bryan Higgins, Jeremy Knight
Original Assignee: Caere Corporation
Primary Examiner: Michael Cammarata
Current U.S. Classification: 382/176; 382/209; 382/229
International Classification: G06K 9/72; G06K 9/34; G06K 9/68

Citations

Cited Patent | Filing date | Issue date | Original Assignee | Title
US3613081 | Nov 3, 1967 | Oct 12, 1971 | | FIG JO
US3641495 | Aug 12, 1970 | 1972 | | CHARACTER RECOGNITION SYSTEM HAVING
US3713099 | Aug 4, 1959 | Jan 2, 1973 | | METHOD AND APPARATUS FOR IDENTIFYING LETTERS, CHARACTERS, SYMBOLS AND THE LIKE
US3713100 | Feb 10, 1953 | Jan 2, 1973 | | VIDEO AMPLIFIER
US3930231 | Jun 10, 1974 | 1975 | | PATCH DATA SR
US4177448 | Jun 26, 1978 | Dec 4, 1979 | International Business Machines Corporation | Character recognition system and method multi-bit curve vector processing
US4491965 | Dec 14, 1982 | Jan 1, 1985 | Tokyo Shibaura Denki Kabushiki Kaisha | Character recognition apparatus
US4589142 | Dec 28, 1983 | May 13, 1986 | International Business Machines Corp. (IBM) | Method and apparatus for character recognition based upon the frequency of occurrence of said characters
US4611346 | Sep 29, 1983 | Sep 9, 1986 | International Business Machines Corporation | Method and apparatus for character recognition accommodating diacritical marks
US4741046 | Jul 22, 1985 | Apr 26, 1988 | Konishiroku Photo Industry Co., Ltd. | Method of discriminating pictures
US4742556 | Sep 16, 1985 | May 3, 1988 | | Character recognition method
US4860376 | Mar 4, 1988 | Aug 22, 1989 | Sharp Kabushiki Kaisha | Character recognition system for optical character reader
US4893188 | Jun 17, 1988 | Jan 9, 1990 | Hitachi, Ltd. | Document image entry system
US5033098 | Aug 27, 1990 | Jul 16, 1991 | Sharp Kabushiki Kaisha | Method of processing character blocks with optical character reader
US5131053 | Aug 10, 1988 | Jul 14, 1992 | Caere Corporation | Optical character recognition method and apparatus

Referenced by

Citing Patent | Filing date | Issue date | Original Assignee | Title
US6094501 | Apr 30, 1998 | Jul 25, 2000 | Shell Oil Company | Determining article location and orientation using three-dimensional X and Y template edge matrices
US6268935 | Apr 17, 1995 | Jul 31, 2001 | Minolta Co., Ltd. | Image processor
US6587103 | Mar 29, 2000 | Jul 1, 2003 | Autodesk, Inc. | Method and apparatus for determining coincident lines
US6832726 | Dec 12, 2001 | Dec 21, 2004 | ZIH Corp. | Barcode optical character recognition
US7311256 | Nov 30, 2004 | Dec 25, 2007 | ZIH Corp. | Barcode optical character recognition
US7657120 | Aug 24, 2007 | Feb 2, 2010 | SRI International | Method and apparatus for determination of text orientation

Claims

1. A system for optically scanning a medium, said medium having thereon an unknown character, said system comprising:

scanning means for scanning said medium, said scanning means providing as output a bit-mapped image of said medium;
memory means coupled with said scanning means for storing said bit-mapped image;
processing means coupled with said memory means including means for parsing said bit-mapped image of said medium and providing as output a bit-mapped representation of said unknown character, means for identifying said unknown character and means for analyzing said unknown character based on the surrounding context of said medium;
said means for analyzing said unknown character based on the surrounding context of said medium includes means for preparing a line of text for context analysis and means for resolving character ambiguities, said means for resolving character ambiguities comprising means for analyzing said line of text to determine spatial information about said line of text in said medium and means for creating attribute data for each character in a line of text.

2. A method for recognizing characters on a medium, said method comprising the steps of:

scanning said medium to produce a bit-mapped image of said medium;
parsing said bit-mapped image to isolate individual characters and providing as output of said parsing process a bit-mapped image of an unknown character;
identifying said unknown character; and
analyzing said unknown character based on the surrounding context of said medium;
said step of analyzing said unknown character based on the surrounding context of said medium further comprising the steps of:
analyzing a line of text in said medium to determine spatial information about said line of text;
creating attribute data for each character in said line of text; and
resolving ambiguities for a character based on said spatial information about said line of text and said attribute data for each character in a line of text.

3. A system for optically scanning a medium, said medium having thereon an unknown character, said system comprising:

scanning means for scanning said medium, said scanning means providing as output a bit-mapped image of said medium;
memory means coupled with said scanning means for storing said bit-mapped image;
processing means coupled with said memory means including means for parsing said bit-mapped image of said medium and providing as output a bit-mapped representation of said unknown character, means for identifying said unknown character and means for analyzing said unknown character based on the surrounding context of said medium;
said means for analyzing said unknown character based on the surrounding context of said medium includes means for preparing a line of text for context analysis and means for resolving character ambiguities;
said means for preparing a line of text for context analysis includes means for creating a histogram of the distances between characters in said line of text, means for determining average heights of known characters in said line of text and means for assigning attribute data to each character in said line of text.
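
The line-preparation elements of claim 3 (a histogram of inter-character distances, average heights of known characters, and per-character attribute data) can be illustrated with a minimal sketch. It assumes each character is already available as a bounding box; the `CharBox` type, the `prepare_line` function, the bin width, and the attribute names are hypothetical and not taken from the patent.

```python
from collections import Counter
from dataclasses import dataclass, field
from statistics import mean
from typing import Optional


@dataclass
class CharBox:
    """Bounding box of one parsed character (hypothetical structure)."""
    left: int
    right: int
    top: int       # y grows downward, as in a bitmap
    bottom: int
    label: Optional[str] = None   # None means not yet identified
    attrs: dict = field(default_factory=dict)


def prepare_line(chars, bin_width=2):
    """Build a gap histogram, average the known heights, and tag each character."""
    chars = sorted(chars, key=lambda c: c.left)
    # Histogram of horizontal distances between neighbouring characters,
    # keyed by the lower edge of each bin.
    gaps = [b.left - a.right for a, b in zip(chars, chars[1:])]
    histogram = Counter(g // bin_width * bin_width for g in gaps)
    # Average height over characters that were already identified.
    known = [c.bottom - c.top for c in chars if c.label is not None]
    avg_height = mean(known) if known else None
    # Attribute data (assumed): height, height relative to the average,
    # and the gap that follows the character.
    for i, c in enumerate(chars):
        height = c.bottom - c.top
        c.attrs["height"] = height
        c.attrs["rel_height"] = height / avg_height if avg_height else None
        c.attrs["gap_after"] = gaps[i] if i < len(gaps) else None
    return histogram, avg_height, chars
```

In a histogram like this, a cluster of small gaps would plausibly correspond to letter spacing and a cluster of larger gaps to word spacing, though the claim does not say how the histogram is interpreted.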

4. A system for optically scanning a medium as recited in claim 1, wherein said means for analyzing said unknown character based on the surrounding context of said medium is further comprised of a database of characteristic attributes for known characters.

5. A system for optically scanning a medium as recited in claim 4, wherein said means for resolving character ambiguities is further comprised of means for accessing said database of characteristic attributes for known characters to retrieve characteristic attributes of said unknown character, and means for resolving character ambiguities based on retrieved characteristic attributes.
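
One way to picture the database lookup in claims 4 and 5 is a table of expected attributes per known character, consulted when identification leaves several candidates. The table contents, the attribute names (`height_class`, `descender`), and the `resolve_ambiguity` function are illustrative assumptions rather than details from the patent.

```python
# Hypothetical database of characteristic attributes for known characters.
ATTRIBUTE_DB = {
    "o": {"height_class": "x-height", "descender": False},
    "O": {"height_class": "cap", "descender": False},
    "0": {"height_class": "cap", "descender": False},
    "p": {"height_class": "x-height", "descender": True},
    "P": {"height_class": "cap", "descender": False},
}


def resolve_ambiguity(candidates, observed):
    """Return the candidates whose stored attributes all match the observed ones."""
    matches = []
    for ch in candidates:
        expected = ATTRIBUTE_DB.get(ch)
        if expected is None:
            continue  # not in the database, so it cannot be confirmed
        if all(observed.get(key) == value for key, value in expected.items()):
            matches.append(ch)
    return matches
```

A shape that could be "o" or "O" is narrowed to one candidate by its height class, while a tie such as "O" versus "0" survives this lookup and would need further context.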

6. The method as recited in claim 1 wherein said spatial information comprises information describing said line of text's skew, character spacing information, and heights of character information.

7. The method as recited in claim 2 wherein said spatial information comprises information describing said line of text's skew, character spacing information, and heights of character information.
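
As a rough illustration of the three kinds of spatial information named in claims 6 and 7 (skew, character spacing, and character heights), the sketch below derives them from character bounding boxes. The tuple layout `(left, right, top, bottom)`, the function name, and the least-squares fit of the bottom edges are assumptions; the claims do not state how these values are computed.

```python
import math
from statistics import mean, median


def line_spatial_info(boxes):
    """boxes: list of (left, right, top, bottom) tuples, y growing downward."""
    boxes = sorted(boxes)
    # Skew: least-squares slope of the bottom edges against the x centres.
    xs = [(left + right) / 2 for left, right, _, _ in boxes]
    ys = [bottom for _, _, _, bottom in boxes]
    x_bar, y_bar = mean(xs), mean(ys)
    denom = sum((x - x_bar) ** 2 for x in xs)
    if denom:
        slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / denom
    else:
        slope = 0.0  # a single box gives no skew estimate
    # Spacing: horizontal gaps between consecutive boxes.
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(boxes, boxes[1:])]
    # Heights: one value per character.
    heights = [bottom - top for _, _, top, bottom in boxes]
    return {
        "skew_degrees": math.degrees(math.atan(slope)),
        "median_gap": median(gaps) if gaps else None,
        "heights": heights,
    }
```

Characters with descenders would pull a fit like this away from the true baseline, so a practical version would likely exclude them or fit more robustly.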