WO1991017519A1 - Row-by-row segmentation and thresholding for optical character recognition - Google Patents

Row-by-row segmentation and thresholding for optical character recognition

Info

Publication number
WO1991017519A1
Authority
WO
WIPO (PCT)
Prior art keywords
pixels
image
threshold
row
rows
Prior art date
Application number
PCT/US1991/003064
Other languages
French (fr)
Inventor
Hin-Leong Tan
Original Assignee
Eastman Kodak Company
Priority date
Filing date
Publication date
Application filed by Eastman Kodak Company filed Critical Eastman Kodak Company
Publication of WO1991017519A1 publication Critical patent/WO1991017519A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/16Image preprocessing
    • G06V30/162Quantising the image signal

Definitions

  • Fig. 4 illustrates how the contents of the gray level threshold buffer 206 fluctuate in the presence of noise in the image, such as a "blob" 114 shown in Fig. 1. If the blob 114 is due to dirt, for example, the pixels in the gray level image stored in the memory 202 which represent the blob tend to have gray levels less than that of the numerals 106, 108, 110. In attempting to read the "1" numeral 108, the recognition processor 210 continues to produce low confidence scores during successive attempts. With each attempt, the control processor 212 raises the gray level threshold until the gray level of the blob 114 is exceeded, and it disappears in the bi-tonal image. This corresponds to the portion of the graph of Fig. 4 in which the threshold value increases over several iterations.
  • During the subsequent character separation, the control processor 212 discovers that the threshold level is well above that required to keep the count of "ON" pixels in the selected row below the noise limit. Thus, over succeeding iterations, the threshold is decremented back to the minimum level. This corresponds to the portion of the graph of Fig. 4 in which the threshold value decreases over several iterations.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

During segmentation of individual characters in a column, the invention predicts for the next character an optimal minimum gray level sensitivity threshold while scanning pixel rows between characters by lowering the threshold unless or until noise in the image causes the scanner to detect too many 'ON' pixels between adjacent numerals. During subsequent attempts to recognize the segmented character, the invention computes a confidence score and increases the sensitivity threshold above the predicted level if the confidence score is insufficient.

Description

ROW-BY-ROW SEGMENTATION AND THRESHOLDING FOR OPTICAL CHARACTER RECOGNITION
BACKGROUND OF THE INVENTION Technical Field: The invention relates to optical character recognition (OCR) systems for microfilm readers and in particular to OCR systems which read microfilm frame numbers.
Background Art:
Each document on a roll of microfilm has an associated frame number which serves as a reference to its position on the film. The current method of verifying that a specific document is located is to have a human operator visually read the frame number. It is desirable to have a system for optically reading the frame numbers automatically.
One problem in attempting to read character images, such as microfilm frame numbers, is that the sensitivity of the document scanner must be adjusted to optimize the performance of the OCR system. If the sensitivity is too low, the OCR system may not be able to read faint characters, while if the sensitivity is too high the OCR system may be confused by noise in the image, such noise arising from dirt or fingerprints on the microfilm, for example.
The OCR system must first segment each character in the frame number. In the typical microfilm image, the numerals of each frame number are printed in a vertical column. Therefore, once the imaginary box bounding the frame numerals is located, segmentation in a vertical direction only is sufficient. This fact is exploited in the invention to solve the problem of controlling the sensitivity of the microfilm scanner, as will be described below.
Various techniques are known for selecting the sensitivity of a document scanner. In most of them, the scanner generates an array of pixels representing the image of the scanned document, each pixel comprising, for example, an eight-bit word specifying one of 256 possible gray levels from black to white. Assuming each character is a white symbol printed on a black background, the OCR system employs a selected sensitivity threshold somewhere within the range of the 256 gray levels. Any pixel whose gray level is below the threshold is considered to be part of the background. The foregoing can be performed by the OCR system in the digital domain or by the scanner in the analog domain.
U.S. Patent No. 3,263,216 (Andrews) discloses the concept of varying the analog (gray level) threshold for scanning printed characters on a document in response to a failure of the pattern recognition system to recognize the current character, scanning the character again using the altered threshold and attempting to recognize the same character. Subsequent attempts to recognize the character may be made while either increasing or decreasing the threshold.
U.S. Patent Nos. 4,742,556 (Davis, Jr., et al.) and 4,593,325 (Kannapell et al.) disclose image gray level thresholding control based on pixel gray levels in a local neighborhood. U.S. Patent Nos. 4,490,852 (Sahni) and 4,731,863 (Sezan et al.) disclose image gray level threshold control based upon gray level peaks in the image.
U.S. Patent No. 4,829,587 (Glazer et al.) discloses increasing the resolution of a binary image by interpolating with gray level pixels to smooth the transitions between binary black and white pixels.
U.S. Patent No. 3,868,637 (Schiller) discloses a microfilm frame number reading device using diffraction patterns of a laser beam.
None of the foregoing references provides a way of initializing the sensitivity threshold at the likeliest optimum value before attempting to read a character. Therefore, as in the patent to Andrews referenced above, there is no reliable way to determine whether the threshold should be increased or decreased upon encountering poor results.
In contrast, the present invention vertically segments each numeral in the microfilm frame number from its predecessor while simultaneously determining the likeliest optimum value for the sensitivity threshold before reading the numeral, based upon results obtained during the reading of the preceding numeral and during the segmentation process. This is performed in such a manner that the threshold is initialized at an optimal minimum value. Accordingly, correction of the threshold following failed attempts to read the next character is always performed by increasing the threshold. Such corrections therefore avoid the trial-and-error increases and decreases of the threshold that characterize the prior art.
DISCLOSURE OF THE INVENTION The invention is an OCR system for reading microfilm frame numbers. The system of the invention first locates the imaginary box bounding the frame number for a given microfilm frame in accordance with uniform microfilm specifications by detecting a standard fiducial mark in the microfilm image. The bounding box is a vertical rectangle within which the frame numerals are in spaced vertical alignment. The system then extracts and reads each numeral one by one beginning at the top of the bounding box.
During extraction of individual characters or numerals, the invention predicts for the next character an optimal minimum gray level sensitivity threshold while scanning pixel rows between characters by lowering the threshold unless or until noise in the image causes the scanner to detect too many "ON" pixels between adjacent numerals. During each attempted recognition of an extracted character or numeral, the invention computes a confidence value and increases the sensitivity threshold if the confidence value is insufficient before the next attempt to recognize the same character.
Threshold Correction During Character Extraction:
After successfully reading the current numeral, the system begins extraction of the next numeral. First, the system proceeds to a location which is a predetermined number of rows (e.g., 4 rows) below the bottom of the numeral previously extracted, which is certain to be between the current numeral and the next numeral. Then, the system determines whether the current row contains less than a predetermined number of pixels (e.g., 5 pixels) whose gray levels are above the sensitivity threshold. If so, the system decreases the gray level sensitivity threshold by a predetermined decrement and repeats this process until either the number of pixels whose gray levels are above the sensitivity threshold exceeds the predetermined number or a minimum threshold value is reached. The system then searches for the top pixel row of the next numeral by looking for the next row having more than another predetermined number of pixels (e.g., 10 pixels) whose gray levels are above the current threshold. The system then searches for the next row having less than the predetermined number of pixels (e.g., 10 pixels) whose gray levels are above the threshold. This second pixel row thus identified is the bottom of the numeral. The image between the top and bottom rows thus identified is then extracted as the next character image to be read. In extracting the image, any pixel whose gray level is below the sensitivity threshold is considered to be a part of the black background and not part of a character stroke. In effect, those pixels having non-zero gray levels below the threshold are discarded as noise.
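The extraction steps above can be sketched as a short routine. This is an illustrative sketch only: the image is assumed to be a list of gray-level pixel rows, the limits (4 rows, 5 pixels, 10 pixels) are the example values from the text, and the decrement step of 5 and minimum threshold of 175 are assumptions drawn from the initialization described later.

```python
ROW_SKIP = 4       # rows to move below the previous numeral
NOISE_LIMIT = 5    # max "ON" pixels tolerated between numerals
CHAR_LIMIT = 10    # min "ON" pixels for a row inside a numeral
MIN_THRESHOLD = 175
STEP = 5           # assumed threshold decrement

def on_count(row, threshold):
    """Number of pixels in the row brighter than the threshold."""
    return sum(1 for p in row if p > threshold)

def extract_next(image, prev_bottom, threshold):
    """Locate the next numeral, lowering the threshold while the
    between-character probe row stays quiet."""
    probe = prev_bottom + ROW_SKIP          # guaranteed between numerals
    while threshold > MIN_THRESHOLD and on_count(image[probe], threshold) < NOISE_LIMIT:
        threshold -= STEP                   # noise is low: be more sensitive
    # First row with enough "ON" pixels is the top of the numeral ...
    top = next(r for r in range(probe, len(image))
               if on_count(image[r], threshold) > CHAR_LIMIT)
    # ... and the next row without enough marks its bottom boundary.
    bottom = next(r for r in range(top + 1, len(image))
                  if on_count(image[r], threshold) <= CHAR_LIMIT)
    return top, bottom, threshold
```

The probe row is read at successively lower thresholds, so the threshold settles at the lowest value the local background noise permits before the next numeral is sought.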
Threshold Correction During Character Reading: Each extracted character image is read by a feature-based character recognition device, such as a feature-based vector distance recognition device. The system also computes a confidence value proportional to the difference between the distances from the vector representing the extracted character image to the two closest reference vectors representing known symbols. The microfilm frame number OCR system of the present invention determines whether the confidence value is above a minimum level. If it is not, the system increases the gray level sensitivity threshold by a predetermined increment and repeats the foregoing extraction process pixel row by pixel row. The cycle is repeated until a satisfactory confidence level is attained.
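This read-and-retry cycle can be sketched compactly. Here `recognize` is a hypothetical stand-in for the binarize-plus-recognize step, returning the distances to the two closest reference vectors; the acceptance level, increment, and retry cap are assumed values, not figures from the text.

```python
def read_with_retries(recognize, threshold, accept=2.0, step=10, max_tries=5):
    """Raise the sensitivity threshold until the confidence value
    (ratio of the two reference-vector distances) is acceptable."""
    for _ in range(max_tries):
        d_closest, d_next = recognize(threshold)
        if d_next / d_closest >= accept:     # confident enough: accept the match
            return threshold, True
        threshold += step                    # re-binarize at a higher threshold
    return threshold, False                  # give up after too many tries
```

Because the threshold starts at the minimum predicted during segmentation, the retry loop only ever moves it upward.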
Gray Level to Bi-Tonal Image Conversion:
In the preferred embodiment of the invention, the feature-based character recognition device is of the type disclosed in U.S. Patent Application Serial No. filed by Hin Leong Tan entitled "A FEATURE-BASED AND TEMPLATE MATCHING OPTICAL CHARACTER RECOGNITION SYSTEM" and assigned to the assignee of the present application. Such a device requires that the gray level image pixels be converted to bi-tonal pixels. In the invention, this conversion is carried out by considering all pixels whose gray levels are above the sensitivity threshold to be "ON" and all pixels whose gray levels are at or below the sensitivity threshold to be "OFF".
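The conversion rule just stated is a single comparison per pixel; a minimal sketch over rows of gray values:

```python
def to_bitonal(gray_rows, threshold):
    """Pixels above the sensitivity threshold become "ON" (1); pixels at
    or below it, including faint noise, become background (0)."""
    return [[1 if p > threshold else 0 for p in row] for row in gray_rows]
```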
Essentially, during character separation the invention predicts the lowest optimum gray level sensitivity threshold for reading the next character, in response to noise in the empty portions of the image.
During the reading of the next character, the invention increases the threshold from the predicted value to the minimum extent necessary, in response to a computed confidence value.
BRIEF DESCRIPTION OF THE DRAWINGS Preferred embodiments of the invention are described below in detail by reference to the accompanying drawings, of which:
Fig. 1 is a pictorial diagram illustrating the operation of the invention in one example;
Fig. 2 is a block diagram of a system embodying the invention;
Fig. 3 is a flow diagram illustrating the operation of the system of Fig. 2;
Fig. 4 is a graph depicting fluctuation of the gray level sensitivity under control by the system of Fig. 2 in the example depicted in Fig. 1.
MODES FOR CARRYING OUT THE INVENTION Referring to Fig. 1, a microfilm strip 100 has a fiducial frame mark 102 which locates an imaginary bounding box containing vertically aligned numerals 106, 108, 110 defining the microfilm frame number. The invention separates and extracts the image of each individual numeral 106, 108, 110 by processing one by one each horizontal row 112 of image pixels.
A system embodying the invention is illustrated in Fig. 2. A microfilm scanner 200 scans the microfilm strip 100 of Fig. 1 to generate an array of horizontal rows and vertical columns of image pixels, each pixel having a gray level value within a predetermined range. For example, if each pixel is an eight-bit byte, then its gray level may be any one of 256 possible values. A gray level image memory 202 stores the digital image generated by the scanner 200.
A comparator 204 converts each pixel to one of two binary values ("ON" and "OFF") depending upon whether the pixel's gray level is above or below a sensitivity threshold stored in a buffer 206, and stores the results in a bi-tonal image memory 208. A feature-based recognition processor 210 performs feature-based optical character recognition on the bi-tonal image stored in the memory 208 and identifies the image with a known symbol while simultaneously computing a confidence value associated with the identification. A control processor 212 responds to the contents of the bi-tonal image memory during character separation or extraction to correct the gray level sensitivity threshold stored in the buffer 206. The control processor 212 responds to the confidence value generated by the recognition processor 210 during the recognition process performed by the processor 210 to correct the sensitivity threshold stored in the buffer 206.
In the preferred embodiment, the recognition processor 210 implements the system described in U.S. Patent Application Serial No. filed by Hin Leong Tan entitled "A FEATURE-BASED AND TEMPLATE MATCHING OPTICAL CHARACTER RECOGNITION SYSTEM" assigned to the assignee of the present application. However, while the referenced patent application teaches aligning each character image into one corner of the image frame, in the present invention it is preferred to center each character in the image frame.
Operation of the system of Fig. 2 will now be described by reference to the flow diagram of Fig. 3.
When the system of Fig. 2 is first initialized, the gray level sensitivity threshold value in the buffer 206 is initialized at an intermediate level between the two extremes of the gray level range of the scanner 200 (block 300 of Fig. 3). For example, if the gray level scale of the scanner is 0 to 255 from black to white, the initial level is set to 175. On the other hand, if the gray level scale is 255 to 0 from black to white, the initial threshold level is set to 125. In the preferred embodiment of the invention, this initial threshold level is also the minimum level below which the system will not permit the threshold to be decreased.
The control processor 212 sets a start pointer to the top pixel row 112 of the bounding box 104 of Fig. 1. This corresponds to the top row address of the memory 202. The processor 212 then fetches from the gray level image memory the horizontal pixel rows between the rows 112a and 112b bounding the top and bottom of the first numeral 106 (block 304 of Fig. 3). Each pixel thus fetched is compared by the comparator with the sensitivity threshold (206) and the binary result (either "ON" or "OFF") is stored in a corresponding location in the bi-tonal image memory. Provided that there is a character below the current row address (FALSE branch of block 306 of Fig. 3), the top and bottom bounding rows 112a and 112b are found in the step of block 304 by searching down from the first (top) row in the image for the first and last horizontal pixel rows, respectively, whose corresponding bi-tonal image in the memory 208 contains more than a predetermined number of "ON" pixels (preferably, 10 pixels).
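The search for the bounding rows 112a and 112b can be sketched over a bi-tonal image held as rows of 0/1 pixels. The 10-pixel limit is the value given in the text; the representation is an assumption for illustration.

```python
def bounding_rows(bitonal, char_limit=10):
    """Top and bottom rows of the topmost numeral: search down for the
    first row with more than char_limit "ON" pixels, then extend through
    the run of such rows that follows it."""
    counts = [sum(row) for row in bitonal]
    top = next(r for r, c in enumerate(counts) if c > char_limit)
    bottom = top
    while bottom + 1 < len(counts) and counts[bottom + 1] > char_limit:
        bottom += 1
    return top, bottom
```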
The recognition processor 210 then generates a vector from the bi-tonal image data stored in the memory 208 and computes the distances between this vector and the two closest reference vectors stored in a reference vector memory 214 (block 308 of Fig. 3) . The recognition processor 210 transmits the ratio of the two distances as a confidence value to the control processor 212, which compares it with a predetermined acceptance level (block 310 of Fig. 3) . If the confidence value is below the predetermined acceptance level (FALSE branch of block
310), the control processor 212 increases the contents of the sensitivity threshold buffer 206 by a predetermined increment (block 312 of Fig. 3) and empties the contents of the bi-tonal memory 208. The system then returns to the step of block 304 and repeats the succeeding steps, so that the gray level image in the memory 202 is again converted to a bi-tonal image accumulated in the memory 208, but with a higher gray level threshold value being used by the comparator 204.
Eventually, an acceptable confidence level is obtained (TRUE branch of block 310 of Fig. 3) so that the recognition processor 210 identifies the current character image with the symbol corresponding to the closest reference vector. In block 314 of Fig. 3, the control processor 212 sets its start pointer four rows below the bottom bounding horizontal row of the previous character found in the step of block 304. If the previous character was the "6" numeral 106 of Fig. 1, then the new start pointer location is four horizontal pixel rows below the row 112b. This guarantees that the pointer is now between adjacent characters where there ideally are no "ON" pixels. Then the control processor
212 counts the number of "ON" pixels in this row. If the number is less than a predetermined number (preferably, 5), this means that the current gray level threshold is more than sufficient to handle the local background noise level (YES branch of block 316 of Fig. 3). In this case the control processor 212 decreases the contents of the gray level threshold buffer 206 by a predetermined decrement (block 318 of Fig. 3).
The decrementing process of blocks 316 and 318 is repeated in the current pixel row until either the number of "ON" pixels is above the predetermined number (NO branch of block 316) or the threshold has been decremented to the minimum level (YES branch of block 317). At this point the operation returns to the step of block 304 of Fig. 3.
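The decrement loop of blocks 316-318 can be sketched as below. The function names, the callback interface, and the step size are illustrative assumptions, not from the patent; the loop structure — step the threshold down while the inter-character row stays quiet, never passing the floor — is the one just described.

```c
#define NOISE_PIXEL_LIMIT 5    /* "quiet" rows have fewer "ON" pixels than this */
#define THRESHOLD_FLOOR   175  /* minimum (initial) threshold level */
#define THRESHOLD_STEP    10   /* illustrative decrement size */

/* While the selected inter-character row stays quiet at the current
 * threshold and the floor has not been reached, step the threshold
 * down, making the comparator more sensitive.  count_on_pixels stands
 * in for the "ON"-pixel count of the row at a given threshold. */
int lower_threshold(int threshold,
                    int (*count_on_pixels)(int threshold, void *ctx),
                    void *ctx)
{
    while (threshold > THRESHOLD_FLOOR &&
           count_on_pixels(threshold, ctx) < NOISE_PIXEL_LIMIT)
        threshold -= THRESHOLD_STEP;
    if (threshold < THRESHOLD_FLOOR)
        threshold = THRESHOLD_FLOOR;
    return threshold;
}

/* stand-in measurement for illustration: pretend background noise
 * appears in the row once the threshold drops below 200 */
int mock_count(int threshold, void *ctx)
{
    (void)ctx;
    return (threshold >= 200) ? 0 : 8;
}
```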
In one embodiment of the invention, in the step of block 312 of Fig. 3, the control processor 212 keeps count of how many times it has incremented the gray level threshold in response to a poor confidence value. If the count ever exceeds a certain number, the control processor declares a failure and causes the system to begin reading the next character.
Fig. 4 illustrates how the contents of the gray level threshold buffer 206 fluctuate in the presence of noise in the image, such as the "blob" 114 shown in Fig. 1. If the blob 114 is due to dirt, for example, the pixels in the gray level image stored in the memory 202 which represent the blob tend to have gray levels lower than those of the numerals 106, 108, 110. In attempting to read the "1" numeral 108, the recognition processor 210 continues to produce low confidence scores during successive attempts. With each attempt, the control processor 212 raises the gray level threshold until the gray level of the blob 114 is exceeded and the blob disappears from the bi-tonal image. This corresponds to the portion of the graph of Fig. 4 in which the threshold value increases over several iterations. At this point the "1" numeral is finally recognized. Then, as the system monitors a selected row in the space in the image between the "1" numeral 108 and the "9" numeral 110, the control processor 212 discovers that the threshold level is well above that required to maintain the count of "ON" pixels in the selected row. Thus, over succeeding iterations, the threshold is decremented back to the minimum level. This corresponds to the portion of the graph of Fig. 4 in which the threshold value decreases over several iterations.
While the invention has been described as processing a document image in which successive characters are aligned vertically above one another, segmenting the characters by horizontal pixel rows, the invention is equally useful with a document image in which the characters are aligned horizontally and lie vertically upright as in normal text. In the latter case, the invention segments the characters by vertical pixel columns, the roles of the rows and columns of the foregoing description being interchanged. A computer program in the C language which implements the invention is attached hereto as Appendix A. In this program, the gray level scale is 255 to zero from black to white and the minimum gray level threshold is 125.
While the invention has been described in detail by specific reference to preferred embodiments thereof, it is understood that variations and modifications may be made without departing from the true spirit and scope of the invention.
APPENDIX A
Copyright 1990 by Eastman Kodak Company
/* OCR on microfilm document references. 8-bit grey-level images */

#include <time.h>
#include <stdio.h>
#include <math.h>

#define MSIZE 1024            /* # rows in input image */
#define NSIZE 512             /* # columns in input image */

#define BEGIN_ROW 51          /* first possible row of horizontal edge */
#define LAST_ROW 512          /* last possible row for horizontal edge */
#define BEGIN_COL 21          /* first possible column for vertical edge */
#define LAST_COL NSIZE-21     /* last possible column for vertical edge */
#define N_START 10            /* 1st of 2 columns for hor edge detection */
#define N_SPACE 40            /* column spacing for hor edge detection */
#define N_ERR_OFF 25          /* offset column position for hor edge */
#define M_START_OFF 10        /* for 1st of 2 rows for vert edge detection */
#define M_SPACE 40            /* row spacing for vert edge detection */
#define M_ERR_OFF 25          /* offset row position for vertical edge */
#define DIST_TOLERANCE 3      /* allowance for difference in edge location */
#define TEXT_OFF_N 30         /* number of columns right to beginning of text */
#define TEXT_OFF_M 40         /* num of rows above to beginning of text */
#define NIL_PIXEL_THRES 5     /* upper bound for # of pixels in non-text line */
#define TEXT_PIXEL_THRES 10   /* lower bound for # of pixels in text line */
#define MAX_NIL_ROWS 80       /* no more text after this number of blank lines */
#define CHAR_LENGTH 112       /* width of character in # of pixels */
#define MIN_DOT_ROWS 20       /* min # of rows to be considered a dot */
#define MAX_DOT_ROWS 35       /* max # of rows to be considered a dot */
#define MIN_DOT_N 60          /* min # cols right of v_edge for dot cg */
#define MAX_DOT_N 100         /* max # cols right of v_edge for dot cg */
#define MIN_CHAR_ROWS 45      /* min # rows to be considered a char */
#define MAX_CHAR_ROWS 90      /* max # rows to be considered a char */

unsigned int im[MSIZE][NSIZE];
float mean_column[MSIZE], num_pixels[MSIZE];
int block_ref[128][128], vector[36], char_refs[11][36];
int mask[200], corner[2], thres, thres_index, thres_array[3];
int start_search;
main()
{
    int i, j;

    /* initialize the 3 levels of thresholds for text segmentation */
    thres_array[0] = 125;
    thres_array[1] = 70;
    thres_array[2] = 40;

    read_char_refs();
    read_block_ref();

    for (;;) {
        read_image();
        printf("\nImage read");
        edge();
        read_text();
    }
}
read_text()
/*******************************************************
 Segment and read the text portion of the image
*******************************************************/
{
    int i, index, begin_row, end_row;
    char text[50];

    /* initialize and set pointer to top row of text region */
    i = 0;
    start_search = 1;
    begin_row = corner[0] - TEXT_OFF_M;

    /* repeat until no more characters are found */
    do {
        index = read_char(begin_row, &end_row);
        start_search = 0;
        /* printf("\nbegin_row= %d end_row= %d index= %d",
                  begin_row, end_row, index); */

        /* convert number to ascii form */
        text[i] = '0' + index;
        if (index == 10) text[i] = '/';
        if (index == 11) text[i] = '.';
        if (index == 91) text[i] = '?';
        ++i;
        begin_row = end_row;
    } while (index < 90);

    /* add terminating character and output text */
    text[i] = '\0';
    printf("\nREFERENCE NUMBER : %s", text);
}
read_char(begin_row, p_end_row)
int begin_row, *p_end_row;
/*********************************************************
 Read the next character beginning from begin_row. After
 it is found, the pointer to the last row of the
 character is placed in end_row
*********************************************************/
{
    int num_rows, cg[2], begin, total, char_index;
    double dist[11], confid;

    /* store the initial row */
    begin = begin_row;

    /* if possible, reset to higher threshold */
    if (thres_index >= 1) {
        thres = thres_array[--thres_index];
        while (thres_index >= 0 &&
               line_pixel_count(begin+4) < TEXT_PIXEL_THRES) {
            printf("\nReset higher threshold = %d", thres_array[thres_index]);
            thres = thres_array[--thres_index];
        }
        thres = thres_array[++thres_index];
    }

    while (thres_index < 3) {

        /* reset starting row */
        begin_row = begin + 4;

        /* find the next char of at least MIN_DOT_ROWS */
        do {
            num_rows = find_char(begin_row, p_end_row);
            begin_row = *p_end_row;
            total = *p_end_row - begin + 4;
        } while (num_rows != 0 && num_rows < MIN_DOT_ROWS &&
                 total <= MAX_NIL_ROWS);

        /* check last char conditions */
        if ((num_rows == 0) ||
            (num_rows < MIN_DOT_ROWS && total > MAX_NIL_ROWS))
            return (99);

        /* check if char is a dot */
        if (num_rows < MAX_DOT_ROWS) {
            find_cg(p_end_row, num_rows, cg);
            if (cg[1] > corner[1]+MIN_DOT_N && cg[1] <= corner[1]+MAX_DOT_N)
                return (11);
        }

        /* identify text char */
        if (num_rows > MIN_CHAR_ROWS && num_rows <= MAX_CHAR_ROWS) {
            find_cg(p_end_row, num_rows, cg);
            pixel_count(cg, num_rows);
            compute_dist(dist);
            char_index = min_dist(dist, &confid);
            if (confid < 0.75)
                return (char_index);
            else
                printf("\nConfidence value %f, trying again", confid);
        }

        /* Reset threshold and try again */
        thres = thres_array[++thres_index];
        if (thres_index < 3)
            printf("\nReset threshold (in read_char) : %d", thres);
    }
    printf("\nCharacter not found");
    return (91);
}
min_dist(dist, p_confid)
double dist[], *p_confid;
/*******************************************************
 Find the minimum distance value in dist[]
*******************************************************/
{
    int n1, n2, i;

    /* find min and min-1 dist */
    n1 = n2 = 0;
    for (i = 0; i < 11; ++i) {
        n1 = (dist[i] <= dist[n1]) ? i : n1;
        n2 = (dist[i] >= dist[n2]) ? i : n2;
    }
    for (i = 0; i < 11; ++i)
        n2 = ((dist[i] <= dist[n2]) && (i != n1)) ? i : n2;

    for (i = 0; i < 11; ++i) {
        printf("\n(%2d) %f", i, dist[i]);
        if (i == n1) printf(" < 1");
        if (i == n2) printf(" < 2");
    }
    printf("\nmin ratio = %f", (dist[n1]/dist[n2]));

    *p_confid = dist[n1]/dist[n2];
    return (n1);
}
compute_dist(dist)
double dist[];
/*******************************************************
 Compute the euclidean distance between vector[] and
 each of the 11 reference vectors char_refs[][].
 Output placed in dist[]
*******************************************************/
{
    int i, j;
    double sqrt(), sum, t;

    /* do for each reference vector */
    for (i = 0; i < 11; ++i) {
        sum = 0;
        /* compute euclidean distance
        for (j = 0; j < 36; ++j) {
            t = vector[j] - char_refs[i][j];
            sum += (t*t);
        }
        sum /= 36.0;
        dist[i] = sqrt(sum);
        */

        /* compute absolute distance */
        for (j = 0; j < 36; ++j) {
            t = vector[j] - char_refs[i][j];
            if (t > 0) sum += t;
            else sum -= t;
        }
        dist[i] = sum/36.0;
    }
}
read_char_refs()
/*******************************************************
 Read in the block pixel count for each of the 11
 reference characters.
*******************************************************/
{
    int i, j;
    char fl[80], buffer[100];
    FILE *fp, *fopen();

    /* Read info from each of the 11 ref files */
    for (i = 0; i < 11; ++i) {
        printf("\nenter ref file for char # %d : ", i);
        scanf("%s", fl);
        printf("%s", fl);
        fp = fopen(fl, "r");

        /* look for XX text separator */
        do fscanf(fp, "%s", buffer);
        while ((buffer[0] != 'X') || (buffer[1] != 'X'));

        for (j = 0; j < 36; ++j)
            fscanf(fp, "%d", &char_refs[i][j]);
        fclose(fp);
    }
}
read_block_ref()
/*******************************************************
 Read in the block assignment for each pixel in the
 lattice
*******************************************************/
{
    int i, j, block_size;
    char fl[100];
    FILE *fp, *fopen();

    printf("\nenter filename for block refs : ");
    scanf("%s", fl);
    printf("%s", fl);
    fp = fopen(fl, "r");
    fscanf(fp, "%d", &block_size);
    for (i = 0; i < 128; ++i)
        for (j = 0; j < 128; ++j)
            fscanf(fp, "%d", &block_ref[i][j]);
    return (block_size);
}
find_cg(p_end_row, numrows, cg)
int *p_end_row, numrows, cg[];
/*****************************************************
 finds cg of char within the specified rows
*****************************************************/
{
    int m, n;
    float pixel_sum, msum, nsum;

    msum = nsum = pixel_sum = 0;
    for (m = (*p_end_row - numrows + 1); m <= *p_end_row; ++m) {
        pixel_sum += num_pixels[m];
        msum += (num_pixels[m] * m);
        nsum += (num_pixels[m] * mean_column[m]);
    }

    /* prevent division by zero */
    pixel_sum = (pixel_sum == 0) ? 1 : pixel_sum;
    cg[0] = (0.5 + msum/pixel_sum);
    cg[1] = (0.5 + nsum/pixel_sum);
}
pixel_count(cg, num_rows)
int cg[], num_rows;
/*******************************************************
 Counts the number of character pixels in each block of
 a partitioned text region referenced from the cg.
*******************************************************/
{
    int i, j;

    /* initialize */
    for (i = 0; i < 36; ++i)
        vector[i] = 0;

    for (i = cg[0] - (num_rows/2); i < cg[0] + (num_rows/2); ++i)
        for (j = cg[1] - (CHAR_LENGTH/2); j < cg[1] + (CHAR_LENGTH/2); ++j)
            if ((im[i][j] <= thres) &&
                block_ref[i - cg[0] + 64][j - cg[1] + 64] != 999)
                ++vector[block_ref[i - cg[0] + 64][j - cg[1] + 64]];
}
find_char(begin_row, p_end_row)
int begin_row, *p_end_row;
/*******************************************************
 Find the next char starting from begin_row. The number
 of rows of the character found will be returned.
*******************************************************/
{
    int end_text, c1, c2, text_start, i, j;

    /* look for the first row of the char */
    i = begin_row;
    end_text = 0;
    while ((line_pixel_count(i) < TEXT_PIXEL_THRES) && end_text != 1) {
        ++i;
        /* no more text if too many blank rows are detected */
        if (i >= MSIZE ||
            (start_search != 1 && (i - begin_row) > MAX_NIL_ROWS))
            end_text = 1;
    }

    if (end_text == 1) {
        return (0);
    }
    else {
        /* begin counting the number of char rows */
        text_start = i;
        c1 = c2 = 0;
        while ((i - text_start) <= (MAX_CHAR_ROWS - 4)) {
            c2 = c1;
            c1 = line_pixel_count(i);
            ++i;
            if (c1 < NIL_PIXEL_THRES && c2 < NIL_PIXEL_THRES)
                break;
        }

        /* offset the number of detected char rows by 2,
           above and below the char text region */
        line_pixel_count(i);
        *p_end_row = i;
        j = i - text_start + 2;
        j = ((j % 2) == 0) ? j : j-1;
        return (j);
    }
}
line_pixel_count(row)
int row;
/*******************************************************
 Count the number of char pixels in the given row
*******************************************************/
{
    int sum, j;
    float sum_col;

    sum_col = sum = 0;
    for (j = corner[1] + TEXT_OFF_N;
         j < corner[1] + TEXT_OFF_N + CHAR_LENGTH; ++j) {
        if (im[row][j] <= thres) {
            ++sum;
            sum_col += j;
        }
    }
    num_pixels[row] = sum;

    /* compute mean & prevent divide by zero */
    sum = (sum == 0) ? 1 : sum;
    mean_column[row] = sum_col / sum;
    return (sum);
}
read_image()
/************************************************
 Reads in the input image
************************************************/
{
    int i, j;
    char fl[80], buf[NSIZE];
    FILE *fp, *fopen();

    printf("\nenter input file : ");
    scanf("%s", fl);
    printf("%s", fl);

    fp = fopen(fl, "r");
    if (fp == 0) {
        printf("\nno input file");
        exit(0);
    }

    /* read in & invert image */
    for (i = 0; i < MSIZE; ++i) {
        fread(buf, 1, NSIZE, fp);
        for (j = 0; j < NSIZE; ++j) {
            im[MSIZE-1-i][NSIZE-1-j] = buf[j];
        }
    }
}
edge()
{
    int i, j, m1, n1;

    init_mask();

    thres_index = 0;
    thres = thres_array[thres_index];

    m1 = hor_edge(0, 20);
    if (m1 == 0) {
        printf("\nTrying offset column position for horizontal edge");
        m1 = hor_edge(N_ERR_OFF, 40);
        if (m1 == 0) {
            ++thres_index;
            thres = thres_array[thres_index];
            printf("\nReset threshold = %d", thres);
            m1 = hor_edge(0, 40);
            if (m1 == 0) {
                printf("\nTrying offset column position for horizontal edge");
                m1 = hor_edge(N_ERR_OFF, 40);
                if (m1 == 0) {
                    printf("\nTrying mask length 100");
                    m1 = hor_edge(0, 100);
                    if (m1 == 0) {
                        printf("\nTrying offset column with mask length 100");
                        m1 = hor_edge(N_ERR_OFF, 100);
                        if (m1 == 0) {
                            printf("\nError : horizontal edge not found");
                            exit(0);
                        }
                    }
                }
            }
        }
    }

    n1 = ver_edge(m1, 0, 20);
    if (n1 == 0) {
        printf("\nTrying row offset for vertical edge");
        n1 = ver_edge(m1, M_ERR_OFF, 40);
        if (n1 == 0) {
            printf("\nError: Vert edge not found");
            exit(0);
        }
    }
    corner[0] = m1;
    corner[1] = n1;
    printf("\nEdge co-ord (%d,%d)", m1, n1);
}
init_mask()
{
    int i, j;

    for (i = 0; i < 100; ++i)
        mask[i] = 0;
    for (i = 100; i < 200; ++i)
        mask[i] = 1;
}
hor_edge(h_off, mask_length)
int h_off, mask_length;
{
    int i, max_m[2];
    int n, mean;

    for (n = N_START + h_off; n < N_START + h_off + 2*N_SPACE; n += N_SPACE) {
        i = (n - N_START - h_off) / N_SPACE;
        max_m[i] = v_conv(n, mask_length);
    }

    /* Check consistency of detected edge points */
    mean = check_dist(max_m, DIST_TOLERANCE);
    return (mean);
}
ver_edge(m1, v_off, mask_length)
int m1, v_off, mask_length;
{
    int i, j, min_n[2];
    int m, n, mean;

    for (m = m1 + M_START_OFF + v_off;
         m < m1 + M_START_OFF + v_off + (2*M_SPACE); m += M_SPACE) {
        i = (m - m1 - M_START_OFF - v_off) / M_SPACE;
        min_n[i] = h_conv(m, mask_length);
    }

    /* Check consistency of detected edge points */
    mean = check_dist(min_n, DIST_TOLERANCE);
    return (mean);
}
h_conv(m, mask_length)
int m, mask_length;
{
    int min_val, val, min_val_col, n, i, binary_im;

    min_val = 0;
    for (n = BEGIN_COL; n < LAST_COL; ++n) {
        val = 0;
        for (i = 0; i < mask_length; ++i) {
            binary_im = (im[m][n-(mask_length/2)+i] <= thres) ? 0 : 1;
            val += (mask[100-(mask_length/2)+i] == 0) ? binary_im : -binary_im;
        }
        if (val < min_val) {
            min_val = val;
            min_val_col = n;
        }
    }
    return (min_val_col);
}
v_conv(n, mask_length)
int n, mask_length;
{
    int m, max_val, val, max_val_row, i, binary_im;

    max_val = 0;
    for (m = BEGIN_ROW; m < LAST_ROW; ++m) {
        val = 0;
        for (i = 0; i < mask_length; ++i) {
            binary_im = (im[m-(mask_length/2)+i][n] <= thres) ? 0 : 1;
            val += (mask[100-(mask_length/2)+i] == 0) ? binary_im : -binary_im;
        }
        if (val > max_val) {
            max_val = val;
            max_val_row = m;
        }
    }
    return (max_val_row);
}
check_dist(list, tolerance)
int list[], tolerance;
{
    int dist;

    /* printf("\ndetected points: %d %d", list[0], list[1]); */
    dist = list[0] - list[1];
    dist = (dist > 0) ? dist : -dist;
    if (dist <= tolerance)
        return ((list[0]+list[1])/2);
    else
        return (0);
}

Claims

What is claimed is:
1. An optical character recognition system which recognizes an individual character in a thresholded image generated from a gray level document image and computes a corresponding confidence score, said document image comprising an array of columns and rows of pixels and characterized by a succession of characters and spaces separating adjacent ones of said characters, said system comprising: thresholding means for discounting those pixels in said document image having a gray level below a certain threshold, whereby to generate said thresholded image from said document image; gray level threshold prediction means for determining a minimum value of said threshold at which the number of pixels not discounted by said thresholding means in a portion of one of said spaces is less than a background noise level and for reducing said threshold to said minimum value; and gray level threshold correction means responsive to the confidence score corresponding to the thresholded image of one of said characters for increasing said threshold from said minimum value to a new threshold whenever the corresponding confidence score is below a predetermined score, whereby said thresholding means generates another thresholded image of said one character corresponding to a higher confidence score.
2. The system of Claim 1 wherein said characters are arranged in a linear succession and wherein said gray level threshold prediction means comprises: noise level detection means for determining in pixel rows in said image the count of pixels not discarded by said thresholding means, whereby said portion of one of said spaces comprises one of said rows; sensitivity enhancement means responsive upon said noise level detection means determining said count for one of said rows for decreasing said threshold by a predetermined decrement whenever said count is less than a first number; and character separation means responsive upon said noise level detection means generating a count above a second number for defining the corresponding one of said rows as the top bounding row of the next one of said characters.
3. The system of Claim 2 wherein said character separation means further comprises means responsive following the defining of said top bounding row and upon said noise level detection means generating a count below a third number for defining the corresponding row as the bottom bounding row of said next one character.
4. The system of Claim 3 further comprising feature-based character recognition means for identifying the portion of said thresholded image between said top and bottom character bounding rows with a known symbol and a corresponding confidence score.
5. The system of Claim 4 wherein said recognition means comprises: means for computing the distances between a vector representing said image portion and the two closest ones of a set of reference vectors representing a set of known symbols; means for identifying the symbol corresponding to the closest one of said reference vectors; and means for computing said confidence score as the ratio between said distances.
6. The system of Claim 3 wherein said first number is 5 pixels and said second and third numbers are both 10 pixels.
7. The system of Claim 1 wherein said thresholding means comprises means for converting all discarded pixels to an "OFF" value and all other pixels to an "ON" value whereby said thresholded image comprises a bi-tonal image.
8. In an optical character recognition system which recognizes an individual character in a thresholded image generated from a gray level document image and computes a corresponding confidence score, said document image comprising an array of columns and rows of pixels and characterized by a succession of characters and spaces separating adjacent ones of said characters, a method for operating said system, comprising: discounting those pixels in said document image having a gray level below a certain threshold, whereby to generate said thresholded image from said document image; determining a minimum value of said threshold at which the density of pixels not discounted by said discounting step in a portion of one of said spaces is less than a background noise level; reducing said threshold to said minimum value; and increasing said threshold from said minimum value to a new threshold whenever the thresholded image of one of said characters is read with a corresponding confidence score below a predetermined score.
9. The method of Claim 8 further comprising: generating another thresholded image of said one character and repeating said increasing step in a cycle which continues until said character is read with a confidence score not below said predetermined score.
10. The method of Claim 8 wherein said characters are arranged in a linear succession and wherein said determining step comprises: determining in individual pixel rows in said image the count of pixels not discounted by said discounting step, whereby said portion of one of said spaces comprises one of said rows; decreasing said threshold by a predetermined decrement if said count in a selected one of said rows is less than a first number; and if said count is above a second number, defining the corresponding one of said rows as the top bounding row of the next one of said characters.
11. The method of Claim 10 further comprising a step following the defining of said top bounding row, and upon said count being below a third number, of defining the corresponding row as the bottom bounding row of said next one character.
12. The method of Claim 11 further comprising identifying the portion of said thresholded image between said top and bottom character bounding rows with a known symbol and a corresponding confidence score.
13. The method of Claim 12 wherein said identifying step comprises: computing the distances between a vector representing said image portion and the two closest ones of a set of reference vectors representing a set of known symbols; identifying the symbol corresponding to the closest one of said reference vectors; and computing said confidence score as the ratio between said distances.
14. The method of Claim 11 wherein said first number is 5 pixels and said second and third numbers are both 10 pixels.
15. The method of Claim 8 further comprising converting all discarded pixels to an "OFF" value and all other pixels to an "ON" value whereby said thresholded image comprises a bi-tonal image.
PCT/US1991/003064 1990-05-08 1991-05-03 Row-by-row segmentation and thresholding for optical character recognition WO1991017519A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US07/520,435 US5081690A (en) 1990-05-08 1990-05-08 Row-by-row segmentation and thresholding for optical character recognition
US520,435 1990-05-08

Publications (1)

Publication Number Publication Date
WO1991017519A1 true WO1991017519A1 (en) 1991-11-14

Family

ID=24072592

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1991/003064 WO1991017519A1 (en) 1990-05-08 1991-05-03 Row-by-row segmentation and thresholding for optical character recognition

Country Status (4)

Country Link
US (1) US5081690A (en)
EP (1) EP0482187A1 (en)
JP (1) JPH05500129A (en)
WO (1) WO1991017519A1 (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5268773A (en) * 1990-03-30 1993-12-07 Samsung Electronics Co., Ltd. Document image signal processor having an adaptive threshold
JP3422541B2 (en) * 1992-12-17 2003-06-30 ゼロックス・コーポレーション Keyword modeling method and non-keyword HMM providing method
US5438630A (en) * 1992-12-17 1995-08-01 Xerox Corporation Word spotting in bitmap images using word bounding boxes and hidden Markov models
JP3272842B2 (en) * 1992-12-17 2002-04-08 ゼロックス・コーポレーション Processor-based decision method
US5757516A (en) * 1993-01-11 1998-05-26 Canon Inc. Noise quenching method and apparatus for a colour display system
US5455872A (en) * 1993-04-26 1995-10-03 International Business Machines Corporation System and method for enhanced character recognition accuracy by adaptive probability weighting
US5454049A (en) * 1993-06-21 1995-09-26 Sony Electronics, Inc. Automatic threshold function for machine vision
DE69519323T2 (en) * 1994-04-15 2001-04-12 Canon Kk System for page segmentation and character recognition
DE69600461T2 (en) 1995-01-17 1999-03-11 Eastman Kodak Co System and method for evaluating the illustration of a form
DK0807297T3 (en) * 1995-01-31 2000-04-10 United Parcel Service Inc Method and apparatus for separating foreground from background in images containing text
US6266445B1 (en) 1998-03-13 2001-07-24 Canon Kabushiki Kaisha Classification-driven thresholding of a normalized grayscale image
JP2000132122A (en) * 1998-10-28 2000-05-12 Fuji Photo Film Co Ltd Scroll-type display capable of continuous display
US7120308B2 (en) * 2001-11-26 2006-10-10 Seiko Epson Corporation Iterated de-noising for image recovery
US7260269B2 (en) * 2002-08-28 2007-08-21 Seiko Epson Corporation Image recovery using thresholding and direct linear solvers
FR2851357B1 (en) * 2003-02-19 2005-04-22 Solystic METHOD FOR THE OPTICAL RECOGNITION OF POSTAL SENDS USING MULTIPLE IMAGES
US7352909B2 (en) * 2003-06-02 2008-04-01 Seiko Epson Corporation Weighted overcomplete de-noising
US20050076301A1 (en) * 2003-10-01 2005-04-07 Weinthal Tevya A. Apparatus, system, and method for managing fitness data
US20050105817A1 (en) * 2003-11-17 2005-05-19 Guleryuz Onur G. Inter and intra band prediction of singularity coefficients using estimates based on nonlinear approximants
US20080310721A1 (en) * 2007-06-14 2008-12-18 John Jinhwan Yang Method And Apparatus For Recognizing Characters In A Document Image
US8331680B2 (en) * 2008-06-23 2012-12-11 International Business Machines Corporation Method of gray-level optical segmentation and isolation using incremental connected components
RU2640331C2 (en) * 2015-12-11 2017-12-27 Частное образовательное учреждение высшего образования "ЮЖНЫЙ УНИВЕРСИТЕТ (ИУБиП)" Method of identifying extended objects of earth surface

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4087790A (en) * 1977-08-22 1978-05-02 Recognition Equipment Incorporated Character presence processor
EP0163377A1 (en) * 1984-04-10 1985-12-04 BRITISH TELECOMMUNICATIONS public limited company Pattern recognition system

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3263216A (en) * 1964-03-20 1966-07-26 Ibm Pattern recognition error correction system employing variable parameter input devices
US3868637A (en) * 1971-09-30 1975-02-25 Michael S Schiller Document retrieval system
US4490852A (en) * 1981-11-17 1984-12-25 Ncr Corporation Image capturing apparatus
JPH0789363B2 (en) * 1983-05-25 1995-09-27 株式会社東芝 Character recognition device
US4593325A (en) * 1984-08-20 1986-06-03 The Mead Corporation Adaptive threshold document duplication
US4742556A (en) * 1985-09-16 1988-05-03 Davis Jr Ray E Character recognition method
US4731863A (en) * 1986-04-07 1988-03-15 Eastman Kodak Company Digital image processing method employing histogram peak detection
US4829587A (en) * 1987-03-02 1989-05-09 Digital Equipment Corporation Fast bitonal to gray scale image scaling
JPH02196565A (en) * 1989-01-25 1990-08-03 Eastman Kodatsuku Japan Kk Picture binarizing system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4087790A (en) * 1977-08-22 1978-05-02 Recognition Equipment Incorporated Character presence processor
EP0163377A1 (en) * 1984-04-10 1985-12-04 BRITISH TELECOMMUNICATIONS public limited company Pattern recognition system

Also Published As

Publication number Publication date
JPH05500129A (en) 1993-01-14
EP0482187A1 (en) 1992-04-29
US5081690A (en) 1992-01-14

Similar Documents

Publication Publication Date Title
WO1991017519A1 (en) Row-by-row segmentation and thresholding for optical character recognition
US5280544A (en) Optical character reading apparatus and method
US4757551A (en) Character recognition method and system capable of recognizing slant characters
US4903312A (en) Character recognition with variable subdivisions of a character region
US5410611A (en) Method for identifying word bounding boxes in text
CA2049758C (en) Optical character recognition system and method
EP0543592B1 (en) Optical word recognition by examination of word shape
EP0483391B1 (en) Automatic signature verification
US5640466A (en) Method of deriving wordshapes for subsequent comparison
EP0472313B1 (en) Image processing method and apparatus therefor
US4556985A (en) Pattern recognition apparatus
US5841905A (en) Business form image identification using projected profiles of graphical lines and text string lines
US5781658A (en) Method of thresholding document images
US6813367B1 (en) Method and apparatus for site selection for data embedding
EP0877335B1 (en) Character recognition method, character recognition apparatus
US6175664B1 (en) Optical character reader with tangent detection for detecting tilt of image data
JP2011257896A (en) Character recognition method and character recognition apparatus
JP2644041B2 (en) Character recognition device
JP2868134B2 (en) Image processing method and apparatus
JPH10222602A (en) Optical character reading device
US5754689A (en) Image processing method and apparatus
JPH0916715A (en) Character recognition system and method therefor
JP3127413B2 (en) Character recognition device
JP3084833B2 (en) Feature extraction device
JP3196755B2 (en) Character inclination detection correction method and apparatus

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IT LU NL SE

WWE Wipo information: entry into national phase

Ref document number: 1991910894

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1991910894

Country of ref document: EP

WWR Wipo information: refused in national office

Ref document number: 1991910894

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1991910894

Country of ref document: EP