A system for recognition of characters on a medium. The system includes a scanner for scanning a medium such as a page of printed text and graphics and producing a bit-mapped representation of the page. The bit-mapped representation of the page is then stored in a memory means such as the memory of a computer system. A processor processes the bit-mapped image to produce an output comprising coded character representations of the text on the page. The present invention discloses parsing a page to allow for production of the output characters in a logical sequence; a combination of feature detection methods and template matching methods for recognition of characters; and a number of methods for feature detection, such as use of statistical data and polygon fitting.
Claims
1. A system for optically scanning a medium, said medium having thereon an unknown character, said system comprising:
- scanning means for scanning said medium, said scanning means providing as output a bit-mapped image of said medium;
- memory means coupled with said scanning means for storing said bit-mapped image;
- processing means coupled with said memory means including means for parsing said bit-mapped image of said medium and providing as output a bit-mapped representation of said unknown character, means for identifying said unknown character and means for analyzing said unknown character based on the surrounding context of said medium;
- said means for analyzing said unknown character based on the surrounding context of said medium includes means for preparing a line of text for context analysis and means for resolving character ambiguities, said means for resolving character ambiguities comprising means for analyzing said line of text to determine spatial information about said line of text in said medium and means for creating attribute data for each character in a line of text.
2. A method for recognizing characters on a medium, said method comprising the steps of:
- scanning said medium to produce a bit-mapped image of said medium;
- parsing said bit-mapped image to isolate individual characters and providing as output of said parsing process a bit-mapped image of an unknown character;
- identifying said unknown character; and
- analyzing said unknown character based on the surrounding context of said medium;
- said step of analyzing said unknown character based on the surrounding context of said medium further comprising the steps of:
- analyzing a line of text in said medium to determine spatial information about said line of text;
- creating attribute data for each character in said line of text; and
- resolving ambiguities for a character based on said spatial information about said line of text and said attribute data for each character in a line of text.
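The context-analysis steps recited in claim 2 can be illustrated with a minimal sketch. It is not the patented implementation: the `CharBox` and `LineInfo` types, the `X_HEIGHT_LETTERS` set, and the midpoint threshold are all hypothetical choices made for illustration. The sketch derives spatial information (typical lowercase and uppercase heights) from characters already identified unambiguously, then uses each ambiguous character's height to choose between upper-case and lower-case candidates such as "o" and "O".

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional, Sequence

# Hypothetical set: lowercase letters with no ascender or descender.
X_HEIGHT_LETTERS = set("acemnorsuvwxz")


@dataclass
class CharBox:
    candidates: Sequence[str]  # possible identities from the recognition step
    top: int                   # y of the top edge (y grows downward)
    bottom: int                # y of the bottom edge

    @property
    def height(self) -> int:
        return self.bottom - self.top


@dataclass
class LineInfo:
    x_height: Optional[float]    # typical height of small lowercase letters
    cap_height: Optional[float]  # typical height of uppercase letters


def analyze_line(chars: Sequence[CharBox]) -> LineInfo:
    """Derive spatial information from unambiguously identified characters."""
    known = [c for c in chars if len(c.candidates) == 1]
    small = [c.height for c in known if c.candidates[0] in X_HEIGHT_LETTERS]
    tall = [c.height for c in known if c.candidates[0].isupper()]
    return LineInfo(
        x_height=median(small) if small else None,
        cap_height=median(tall) if tall else None,
    )


def resolve(char: CharBox, info: LineInfo) -> str:
    """Pick a candidate by comparing the character's height to the line's."""
    if len(char.candidates) == 1 or info.x_height is None or info.cap_height is None:
        return char.candidates[0]  # nothing to resolve, or no evidence
    # Assumed rule: the midpoint between the two typical heights separates cases.
    threshold = (info.x_height + info.cap_height) / 2
    want_upper = char.height > threshold
    for cand in char.candidates:
        if cand.isupper() == want_upper:
            return cand
    return char.candidates[0]
```

With a line whose known capitals are about 20 pixels tall and whose known small letters are about 12 pixels tall, an ambiguous 12-pixel glyph with candidates ("o", "O") resolves to "o".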
3. A system for optically scanning a medium, said medium having thereon an unknown character, said system comprising:
- scanning means for scanning said medium, said scanning means providing as output a bit-mapped image of said medium;
- memory means coupled with said scanning means for storing said bit-mapped image;
- processing means coupled with said memory means including means for parsing said bit-mapped image of said medium and providing as output a bit-mapped representation of said unknown character, means for identifying said unknown character and means for analyzing said unknown character based on the surrounding context of said medium;
- said means for analyzing said unknown character based on the surrounding context of said medium includes means for preparing a line of text for context analysis and means for resolving character ambiguities;
- said means for preparing a line of text for context analysis includes means for creating a histogram of the distances between characters in said line of text, means for determining average heights of known characters in said line of text and means for assigning attribute data to each character in said line of text.
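The line-preparation elements recited in claim 3 (a histogram of inter-character distances, average heights of known characters, and per-character attribute data) might be sketched as follows. The `Glyph` type, the attribute names `relative_height` and `starts_word`, and the `word_gap_factor` parameter are assumptions introduced here for illustration; the claim does not specify which attributes are stored or how gaps are classified.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Glyph:
    left: int
    right: int
    top: int
    bottom: int
    label: Optional[str] = None  # None while the character is still unknown
    attributes: dict = field(default_factory=dict)


def gap_histogram(glyphs: List[Glyph]) -> Counter:
    """Histogram of horizontal distances between neighbouring characters."""
    ordered = sorted(glyphs, key=lambda g: g.left)
    gaps = [max(0, b.left - a.right) for a, b in zip(ordered, ordered[1:])]
    return Counter(gaps)


def average_known_height(glyphs: List[Glyph]) -> Optional[float]:
    """Mean height of the characters that have already been identified."""
    heights = [g.bottom - g.top for g in glyphs if g.label is not None]
    return sum(heights) / len(heights) if heights else None


def prepare_line(glyphs: List[Glyph],
                 word_gap_factor: float = 2.0) -> Tuple[Counter, Optional[float]]:
    """Assign attribute data to every glyph in the line."""
    hist = gap_histogram(glyphs)
    avg_h = average_known_height(glyphs)
    # Assumption: the most frequent gap is the typical inter-character gap.
    typical_gap = hist.most_common(1)[0][0] if hist else 0
    ordered = sorted(glyphs, key=lambda g: g.left)
    for i, g in enumerate(ordered):
        gap_before = max(0, g.left - ordered[i - 1].right) if i > 0 else None
        g.attributes = {
            "relative_height": (g.bottom - g.top) / avg_h if avg_h else None,
            # A gap much wider than the typical gap is treated as a word break.
            "starts_word": gap_before is None
            or gap_before > word_gap_factor * max(typical_gap, 1),
        }
    return hist, avg_h
```

Taking the most frequent gap as the within-word spacing is a simple heuristic; a proportional font or justified text would likely need a more careful analysis of the histogram's shape.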
4. A system for optically scanning a medium as recited in claim 1, wherein said means for analyzing said unknown character based on the surrounding context of said medium is further comprised of a database of characteristic attributes for known characters.
5. A system for optically scanning a medium as recited in claim 4, wherein said means for resolving character ambiguities is further comprised of means for accessing said database of characteristic attributes for known characters to retrieve characteristic attributes of said unknown character, and means for resolving character ambiguities based on retrieved characteristic attributes.
6. The method as recited in claim 1 wherein said spatial information comprises information describing said line of text's skew, character spacing information, and heights of character information.
7. The method as recited in claim 2 wherein said spatial information comprises information describing said line of text's skew, character spacing information, and heights of character information.
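Claims 6 and 7 list a line's skew among the spatial information. One plausible way to estimate skew, offered here only as a sketch and not as the method the patent describes, is an ordinary least-squares fit through the bottom edges of the character boxes. The function name `estimate_skew` and the `(left, right, bottom)` tuple layout are hypothetical.

```python
import math
from typing import List, Tuple


def estimate_skew(boxes: List[Tuple[float, float, float]]) -> float:
    """Return the skew angle of a text line in degrees.

    Each box is (left, right, bottom) in pixels, with y growing downward.
    A least-squares line is fitted through the points
    (horizontal centre, bottom edge) of the boxes.
    """
    n = len(boxes)
    if n < 2:
        return 0.0  # not enough points to define a slope
    xs = [(left + right) / 2 for left, right, _ in boxes]
    ys = [bottom for _, _, bottom in boxes]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0  # all boxes share one centre; slope is undefined
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return math.degrees(math.atan(sxy / sxx))
```

Characters with descenders (such as "g" or "p") sit below the baseline and would bias this fit, so a practical version would probably exclude them or use a robust estimator.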