CA1317377C - Image recognition apparatus - Google Patents

Image recognition apparatus

Info

Publication number
CA1317377C
CA1317377C CA000595251A CA595251A
Authority
CA
Canada
Prior art keywords
character
recognition
probability
image
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CA000595251A
Other languages
French (fr)
Inventor
Yohji Nakamura
Richard G. Casey
Kazuharu Toyokawa
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Application granted granted Critical
Publication of CA1317377C publication Critical patent/CA1317377C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/24Character recognition characterised by the processing or recognition method
    • G06V30/242Division of the character sequences into groups prior to recognition; Selection of dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173Classification techniques

Abstract

IMAGE RECOGNITION APPARATUS
Abstract: In a document scanning device, a pattern or character recognition algorithm employed to generate image data differentiates between high- and low-probability data and enables modification of the recognition procedure for handling recognition errors.

Description

IMAGE RECOGNITION APPARATUS

This invention relates to a pattern recognition apparatus. More particularly, the invention relates to a pattern recognition apparatus which, during the recognition of patterns or images of characters, modifies the recognition procedure in response to erroneously recognized patterns or images of characters.

Prior Art and Problems

Systems have been developed which supply images of printed characters of a document, scanned by an image scanning device, to a personal computer to perform character recognition by a character recognition algorithm. Since it is desired to recognize the characters at high speed in such a system, character recognition trees of the point sampling type are used, wherein particular picture elements (pels) of character images are sequentially sampled and plural nodes are sequentially selected depending upon whether the sampled pel is white or black. A character recognition tree is prepared for each type font. Since it is very difficult to modify a completed character recognition tree to recognize another font, a number of character recognition trees for the type fonts expected to be used frequently must be prepared. For example, character recognition trees for thirty kinds of fonts are stored in a memory. Accordingly, when a document includes characters of a font which differs from the fonts of the trees stored in the memory, the probability of recognition is remarkably decreased.

R. G. Casey and C. R. Jih, "A Processor-Based OCR System", IBM Journal of Research and Development, Volume 27, No. 4, July 1983, pp. 385 - 399, discloses a system which uses three trees as one set and combines the results from each tree.

F. M. Wahl, K. Y. Wong and R. G. Casey, "Segmentation and Text Extraction in Mixed Text/Image Documents", Computer Graphics and Image Processing, Volume 20, 1982, pp. 375 - 390, discloses a method for automatically discriminating text regions and non-text regions of a printed document.

R. G. Casey and G. Nagy, "Recursive segmentation and classification of composite character patterns", the 6th International Conference on Pattern Recognition, October 1982, discloses the use of decision tree for effectively segmenting and recognizing characters.

R. G. Casey and G. Nagy, "Decision Tree Design Using a Probabilistic Model", IEEE Transaction on Information Theory, Volume IT30, No. 1, January 1984, pp. 93 - 99, and R. G. Casey, "Automatic generation of OCR logic 1~17377 from scanned characters'l, IBM Technical Disclosure Bulletin, Volume 22, No. 3, August 1979, page 1189 disclose a method for making a mathematic model of a decision three from a pro~ability of pels. K. Y. Wong, R. G. Casey and F. Wahl, "Document Analysis System:", IEEE 6th International Conference on Pattern Recogni-tion, October 1982, discloses an adaptive OCR reading a document by partitioning the document to a text area and a non-text area.

R. G. Casey, S. K. Chai and K. Y. Wong, "Unsupervised construction of decision networks for pattern classification", IEEE 7th International Conference on Pattern Recognition, August 1984, and US Patent 4,499,596 by R. G. Casey and T. D. Friedman, disclose a method for increasing the speed of recognition by determining some pels of a text character before comparing an entire pattern of the character.

Y. Mano, Y. Nakamura and K. Toyokawa, "Automatic font selection for character recognition", IBM Technical Disclosure Bulletin, Volume 30, No. 3, August 1987, pp. 1112 - 1114, discloses a method for determining which one of plural type fonts is used to print the characters of a document.

Summary of Invention

A document scanning device, under the control of a pattern or character recognition algorithm or character recognition means, optically scans patterns or images of characters on a document and generates image data of binary 1 or 0 representing the images. For simplifying the description, the image data of binary 1 or 0 is called the image of a character or pattern. The images of the patterns or characters of one page of a document are stored in an image buffer under the control of the character recognition algorithm. Each pattern or image of a character is then segmented and stored in a working buffer. The character recognition algorithm recognizes the images stored in the working buffer and stores the results in a resultant buffer. The character recognition algorithm stores the character codes of characters recognized with high probability in the resultant buffer, and stores the character codes and images of characters recognized with low probability, together with flags indicating the low probability, in the resultant buffer. The character recognition algorithm then displays the contents of the resultant buffer on a display screen of a display device.
An operator viewing the displayed results of the recognition inputs, through a keyboard, a correct character code for each erroneously recognized character, to correct the character code of the erroneously recognized character in the resultant buffer.
The operations for modifying the trees based upon the results of recognition are then started. The character recognition algorithm stores the image of the erroneously recognized character and its correct character code in a learning buffer.

The character recognition algorithm sequentially scans each pel of the character image in the learning buffer by using a scanning window. The size of the window is 3 x 3 pels, for example. The central pel of the window is positioned at the pel of the character image to be processed. The character recognition algorithm accesses an entry of a statistics table by using the bit pattern of the neighboring pels of the central pel as an address. The accessed entry stores a value representing the probability of black appearing in the pel being processed. The character recognition algorithm fetches this value of probability and stores it in a storage position of a probability map which corresponds to the position of the central pel.

The character recognition algorithm supplies the values of the probability map to the character recognition tree as input, detects a leaf node generating a high value of probability of recognition, fetches the character image of the character code assigned to this leaf node, and forms the probability map of that character image. Next, the character recognition algorithm compares the values of the two probability maps at each storage position, detects a storage position at which one of the values is high and the other is low, replaces the character code originally assigned to the detected leaf node by the address of the detected storage position, extends two nodes from the leaf node, assigns the originally assigned character code to one of the extended nodes, and assigns the character code of the character image in the learning buffer to the other node. The character recognition algorithm supplies the character codes in the resultant buffer to the data processing device and stores the modified character recognition tree in the tree memory. Accordingly, the new tree is used in the subsequent character recognition operations.

Description of Drawings

Fig. 1 shows the block diagram of the character recognition apparatus of the present invention.

Figs. 2A, 2B, 2C, 2D and 2E are the flowcharts showing the recognition steps and the tree modifying steps of the character recognition method of the present invention.

Figs. 3 and 3A show the character images. Fig. 3B
shows the character recognition tree. Fig. 3C shows one set of trees stored in a memory space of the tree memory. Fig. 3D shows the operations for inputting the values of probability in the map buffer to the tree shown in the Fig. 3B. Fig. 3E shows the operations for extending the one leaf node of the tree.

Fig. 4 shows the operations for scanning the character image by using the window.

Fig. 5 shows the addresses of the window.

Figs. 6 and 7 show the two areas of the map buffer storing the values of probability of each pel position of the character images A and B. And, Fig. 8 shows a flowchart showing the operations of the nodes of the tree.

Referring to Figs. 1 and 2, the Fig. 1 shows a circuit block diagram for modifying the trees by the learning function, in accordance with the present invention, and the Figs. 2A through 2E show a flowchart representing the operations of the circuit blocks.

A document scanning device 1 is a well-known device which includes a document feed mechanism, a light source, a lens, plural optical sense elements arranged in a line, and a threshold circuit. The number of the optical sense elements is 8 elements/mm, for example. That is, the pel density in the main scan direction is 200 pels/inch, and the pel density in the sub-scan direction perpendicular to the main scan direction is also 200 pels/inch, for example. One optical element generates an analog signal of one pel, and this analog signal is applied to the threshold circuit. A binary 1 signal representing a black pel is generated when the analog signal is lower than a predetermined threshold value, and a binary 0 signal representing a white pel is generated when the analog signal is higher than the threshold. These operations are performed by the blocks 21 and 22 in the Fig. 2A.
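
For illustration only, a minimal sketch of the thresholding performed in the blocks 21 and 22, assuming one analog reflectance sample per pel; the 0.0 - 1.0 sample range and the threshold value are assumptions, not figures from the patent:

```python
def binarize(samples, threshold=0.5):
    """Convert the analog samples of one scan line into binary pels.

    A sample darker (lower) than the threshold becomes 1 (black pel);
    a brighter sample becomes 0 (white pel), as in blocks 21 and 22.
    """
    return [1 if s < threshold else 0 for s in samples]

# Usage: one scan line of analog samples, 0.0 (dark) .. 1.0 (bright)
print(binarize([0.9, 0.1, 0.2, 0.95]))  # [0, 1, 1, 0]
```
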
The document includes plural character rows, and one character row includes about 50 printed alphanumeric characters. The reading operation of the document is performed row by row, and the character images of one page of the document are stored in the image buffer 2. These operations are performed by a block 23 in the Fig. 2A. A character segmenting logic circuit 3 is a well-known circuit which scans the image buffer to find the rectangle which contacts the outer edges of each character in the first row, to segment each character in the first row. All the characters in one row are segmented and stored in frames of 9 x 11 pels of the working buffer 4, with the center of the character image positioned at the center of the 9 x 11 pel frame. These operations are performed by the blocks 24 and 25 of the Fig. 2A. The control device 5, such as a microprocessor, controls the above operations.
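
A minimal sketch of the centering step performed when a segmented character is stored into a frame of the working buffer 4, assuming the character has already been isolated as a list of rows of 0/1 pels; the function name and the assumption that the glyph fits inside the frame are illustrative:

```python
def center_in_frame(glyph, frame_w=9, frame_h=11):
    """Place a segmented character image at the center of a fixed frame.

    `glyph` is a list of rows of 0/1 pels bounded by the rectangle that
    contacts the outer edges of the character, as produced by the
    segmenting logic circuit 3; the frame is the 9 x 11 pel frame of the
    working buffer 4.
    """
    gh, gw = len(glyph), len(glyph[0])
    frame = [[0] * frame_w for _ in range(frame_h)]
    top, left = (frame_h - gh) // 2, (frame_w - gw) // 2
    for y in range(gh):
        for x in range(gw):
            frame[top + y][left + x] = glyph[y][x]
    return frame

# Usage: a 3 x 3 glyph dropped into the center of a 9 x 11 frame
frame = center_in_frame([[1, 1, 1],
                         [1, 0, 1],
                         [1, 1, 1]])
```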


Before describing the next operations, the type fonts of the characters and the tree memory 6 are described. The characters of the document are printed by various type fonts. The type fonts are classified into a 10 pitch group, a 12 pitch group, a letter gothic group, an orator group and a proportional group. The 10 pitch group includes the type fonts Courier 10, Pica 10, Prestige Pica 10 and Titan 10; the 12 pitch group includes the type fonts Courier 12, Elite 12, Prestige Elite 12 and OCR B12; the letter gothic group includes the type fonts Orator and Presenter; and the proportional group includes the type fonts Bold, Cubic/Triad, Roman and Title. The tree memory 6 stores the character recognition trees, the character images and the character codes for each type font. Referring to the Fig. 1, the tree memory 6 is divided into 18 memory spaces. One type font is assigned to one memory space. In the Fig. 1, only four memory spaces 6A, 6B, 6C and 6D are shown. The memory space 6A stores one set of character recognition trees, i.e. three character recognition trees 6E, 6F and 6G, all the character images and all the character codes of the type font "Courier 10". In the same manner, the memory spaces 6B, 6C and 6D store the character recognition trees, all the character images and all the character codes of Pica 10, Elite 12 and Orator.

After the operations of the block 25 in the Fig. 2A, the control circuit 5 starts the operations of the character recognition logic circuit 10. The circuit 10, in the block 26, fetches one set of character recognition trees and all the character codes from one of the memory spaces. For example, it is assumed that the character recognition trees 6E, 6F and 6G and all the character codes for the Courier 10 in the memory space 6A have been fetched. The operations proceed to the block 27, and the character recognition logic circuit 10 accesses the first frame of the working buffer 4 to fetch the first character image of the first character row, recognizes the first character image by using the first character recognition tree 6E, and stores the result in an auxiliary buffer (not shown). Next, the character recognition logic circuit 10 recognizes the first character image again by using the second character recognition tree 6F and stores the result in the auxiliary buffer. Next, the character recognition logic circuit 10 recognizes the first character image again by using the third character recognition tree 6G, and stores the result in the auxiliary buffer. Next, the character recognition logic circuit determines the three results stored in the auxiliary buffer and generates the final result depending upon the three results, as below:

Case 1: stores the character code in the resultant buffer 7 when a degree of accuracy is higher than a predetermined value.

Case 2: stores the character code which is the result of the recognition, together with the image of this character, in the resultant buffer and sets a flag to 1, when the degree of accuracy is lower than the value. The flag 1 represents that this character is a rejected character.
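
A sketch of how the three per-tree results might be combined and stored according to Case 1 and Case 2. The majority vote, the `Result` record and the accuracy threshold are assumptions for illustration; the patent only states that the final result depends upon the three results and that low-accuracy characters are flagged:

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Result:
    code: str        # recognized character code
    image: object    # character image, kept only for rejected characters
    rejected: bool   # the "flag 1" of Case 2

def combine(tree_results, char_image, accuracy_threshold=2.0):
    """Combine the results of the three trees 6E, 6F and 6G for one character.

    `tree_results` is a list of (character_code, degree_of_accuracy) pairs,
    one per tree. Case 1: accuracy above the threshold -> store only the code.
    Case 2: otherwise -> store the code and the image and set the reject flag.
    """
    votes = Counter(code for code, _ in tree_results)
    code, _ = votes.most_common(1)[0]
    accuracy = max(acc for c, acc in tree_results if c == code)
    if accuracy > accuracy_threshold:
        return Result(code, None, rejected=False)    # Case 1
    return Result(code, char_image, rejected=True)   # Case 2

def reject_ratio(row_results):
    """Percentage of rejected characters in one character row, e.g. 2/50 = 4%."""
    return 100.0 * sum(r.rejected for r in row_results) / len(row_results)
```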

The above recognition operations of the character image are described with reference to Figs. 3A, 3B and 3C. It is assumed that the second and third characters "B" and "M" are typed in the font Courier 10, and the first character "A" is typed in a different type font. The images 31 - 33 of these characters have been stored in three frames of the working buffer 4.

The Fig. 3B shows a simplified form of the tree 6E among the three character recognition trees 6E - 6G for the Courier 10 in the memory space 6A shown in the Fig. 1. The character recognition tree determines whether a pel at a predetermined pel position is white or black, selects the succeeding node depending upon whether the previous pel was black or white, and generates the final result. That is, the tree 6E selects a pel address (4, 6) of the image 31 in the frame at the root node 301, determines whether the pel at this address is black or white, and proceeds to the next branch node 302 or 303 depending upon the result of the node 301. In the exemplary case, the answer of the root node 301 is white, so the branch node 302 is selected, at which a pel at an address (4, 0) is selected. This pel is determined to be white or black. Since the answer is black, the next branch node 304 is selected and a pel at an address (8, 5) is determined to be white or black. Since the answer is white, the operations proceed to a leaf node 305 and the result of recognition, "B", is obtained. That is, although the first character 31 is A, the character recognition tree 6E has recognized this first character as B. If the type font of this first character were the type font of the Courier 10, as shown by the reference number 34 in the Fig. 3, the tree 6E would trace from the node 301 to a node 307 through the nodes 303 and 306, and would generate the answer A. Since the actual result of the recognition is B, the character recognition logic circuit 10 stores the character code of the character B and the degree of accuracy of the recognition into the auxiliary buffer (not shown).
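
For illustration, a sketch of the point-sampling traversal just described, assuming a node is held as a small record containing the sampled pel address and its two successors; the data layout, the placeholder codes "?" and the omitted branches are assumptions, since the patent does not specify how the tree is stored:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    code: str                        # character code assigned to this leaf

@dataclass
class Node:
    pel: tuple                       # (x, y) address of the pel sampled at this node
    on_white: Union["Node", "Leaf"]  # successor taken when the sampled pel is 0 (white)
    on_black: Union["Node", "Leaf"]  # successor taken when the sampled pel is 1 (black)

def classify(frame, node):
    """Walk a point-sampling tree over a binary 9 x 11 frame, as the tree 6E does.

    For the path of Fig. 3B: root 301 samples (4, 6) -> white -> node 302
    samples (4, 0) -> black -> node 304 samples (8, 5) -> white -> leaf 305 "B".
    """
    while isinstance(node, Node):
        x, y = node.pel
        node = node.on_black if frame[y][x] == 1 else node.on_white
    return node.code

# A fragment of the Fig. 3B path; other branches and leaf codes are placeholders.
node_304 = Node((8, 5), on_white=Leaf("B"), on_black=Leaf("?"))                    # leaves 305 / 312
node_302 = Node((4, 0), on_white=Node((1, 10), Leaf("?"), Leaf("?")), on_black=node_304)
tree_6E = Node((4, 6), on_white=node_302, on_black=Leaf("?"))                      # node 303 subtree omitted
```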

Next, the character recognition logic circuit 10 recognizes the first character A again by using the tree 6F, stores the result in the auxiliary buffer, recognizes the first character A again by using the tree 6G, stores the result in the auxiliary buffer, and generates the final result based upon these three results. The recognition of characters by using three kinds of trees is described in the first article cited hereinbefore.

It is assumed that the final result of the recognition of the first character is B. The character recognition logic circuit 10 stores the character code of the character B in the first character position in an area 7A of the resultant buffer 7. In the same manner as described above, the character recognition logic circuit 10 fetches the second character image B from the working buffer 4, recognizes the image by using the three trees 6E, 6F and 6G to generate the final result, and stores the final result into the second character position in the area 7A of the resultant buffer 7. In this manner, all the characters, e.g. fifty characters, in the first character row are recognized and the final results of all the characters are stored in the area 7A of the resultant buffer 7. Although the character recognition tree 6E shown in the Fig. 3B is shown in a simplified form, the tree actually includes 10 - 11 decision steps. When the total number of nodes is 2999, the number of leaf nodes is about 1500. It is assumed that, although the fourth printed character in the first character row of the document is actually Q, the final result of the recognition is O and its degree of accuracy is low, and that, although the 25th printed character in the first character row of the document is actually P, the final result of the recognition is R and its degree of accuracy is low. In this case, the character recognition logic circuit 10 sets flag 1 and stores the character code of the character O and the actual image of the printed character Q in the fourth character position in the area 7A of the resultant buffer 7; it also sets flag 1 and stores the character code of the character R and the actual image of the printed character P in the 25th character position in the area 7A of the resultant buffer 7, as shown in the Fig. 1. And, the character recognition logic circuit 10 calculates a reject ratio based upon the number of flags in the area 7A and stores it. In the exemplary case, since the flags of two characters among the total 50 characters are set to 1, the reject ratio is 4%, and this 4% is stored.

Next, in a block 28, it is determined whether or not all the fonts, i.e. the 18 kinds of fonts in the exemplary case, have been used in the recognition operations. If the answer is NO, the operations return to the block 26, in which the memory space of the next font in the tree memory 6 is accessed, and the above described operations are performed. The block 28 also generates the answer YES when a character code and a character image have been stored in the learning buffer 9 during the processing of the character row.

When the recognition operations of the first character row using all the sets of recognition trees for the eighteen type fonts have been completed, the answer of the block 28 is YES, and the results of the recognition have been stored in the eighteen areas 7A, 7B, ..., 7N of the resultant buffer 7. For simplifying the Fig. 1, only the reject ratio is shown in each of the areas 7B, ..., 7N.

Next, the operations proceed to a block 29 in the Fig. 2B. The character recognition logic circuit 10 determines the reject ratio of each area 7A, ..., 7N of the resultant buffer 7, selects the type font with the lowest reject ratio, and uses the trees of the selected font in the subsequent processes. It is assumed that the reject ratio of the Courier 10 in the area 7A is the minimum. Next, the character recognition logic circuit 10, in a block 30 of the Fig. 2B, determines whether the reject ratio of 4% of the Courier 10 is equal to 0%. Since the answer is NO, the operations proceed to a block 31. If the answer of the block 30 is YES, the operations return to the block 24 of the Fig. 2A, and the operations for processing the next character row are started.

In the block 31, the character codes of the first character row stored in the area 7A of the resultant buffer 7 are supplied to a character generator circuit, not shown, and the character images from the character generator circuit are displayed on the display area 8A of the display screen 8, wherein the 4th and 25th characters, for which the flag 1 is set, are displayed blinking or highlighted. And, all the character images of the first character row stored in the working buffer 4 are displayed on the display area 8B of the display screen 8. Accordingly, the operator can compare the actual character images of the document displayed on the display area 8B with the results of recognition displayed on the display area 8A of the display screen 8, and corrects the character code B of the first character in the area 7A of the resultant buffer 7 to the character code A by positioning a cursor at the first character and inputting the character code of the character A from a keyboard, not shown; corrects the character code O of the 4th character in the area 7A to the character code Q by positioning the cursor at the 4th character and inputting the character code Q; and corrects the character code R of the 25th character in the area 7A to the character code P by positioning the cursor at the 25th character and inputting the character code P. That is, in this step, the corrected character codes of all the characters of the first character row have been stored in the resultant buffer 7. A block 32 in the Fig. 2B determines whether or not the input operations by the operator have been performed, and if the answer is YES, the operations proceed to a block 33.

In the block 33, the corrected character code A of the first character in the area 7A of the resultant buffer 7 and the image A of the first character in the working buffer 4 are stored in the first position of the learning buffer 9, as shown; the corrected character code Q of the 4th character in the area 7A and the character image Q are stored in the next position of the learning buffer 9; and the corrected character code P of the 25th character in the area 7A and the character image P are stored in the third position of the learning buffer 9. That is, on each completion of the recognition of one character row, the algorithm determines whether a correction by the operator has been made, and if so, stores the contents into the learning buffer 9. If a newly corrected character is the same as a character already stored in the learning buffer 9, the new character is not stored in the buffer 9. And, whenever the correction of the character codes of one character row in the resultant buffer 7 has been completed, the character codes of that character row, including the corrected character codes, are stored in an output buffer 13. When the operations of the block 33 have been completed, the operations proceed to a block 34. In the block 33, after the character code and the character image have been stored in the learning buffer 9, the trees used for the first character row are used in the recognition of the succeeding character rows. That is, a new set of trees is not used when the answer of the block 34 is NO and the operations return to the block 24 and proceed to the block 26 through the block 25. That is, in this case, the block 26 uses the set of trees used in the first character row rather than the next new set of trees. And, the block 28 generates the output YES.

The block 34 determines whether all the character rows of one page document have been processed. If NO, the operations return to the block 24 in the Fig. 2A to process the remaining character rows. If YES, the operations proceed to a block 35 in the Fig. 2C.

In the block 35, the character recognition logic circuit 10 fetches the character code A and the character image A from the first position of the learning buffer 9 and performs the operations of a block 36.

Before describing the operations of the block 36, the 3 x 3 pel window 41 shown in Fig. 4, the statistics table 11 shown in the Fig. 1 and the map buffer 12 shown in the Fig. 1 are described.

The eight peripheral pels of the 3 x 3 pel window 41 are assigned pel addresses, as shown in Fig. 5, and the 3 x 3 pel window 41 is positioned so that the central pel X is at the upper-left corner pel of the character image, as shown in the Fig. 4. The statistics table 11 stores values representing the probability of black appearing in the central pel X, i.e. the current pel being processed, as shown in Table 1. The bit in the pel address 1 of the 3 x 3 pel window 41 represents the most significant bit (MSB) and the bit in the pel address 8 represents the least significant bit (LSB). The map buffer 12 includes an area 12A and an area 12B. The size of each area is the same as the size of a frame of the working buffer 4 in the Fig. 1, i.e. 9 x 11 pels. That is, the size of each of the areas 12A and 12B is the same as the size of each of the images 31, 32 and 33 shown in the Fig. 3A.

Describing the operations of the block 36, the character recognition logic circuit 10 positions the current pel X of the 3 x 3 pel window 41 at the upper left pel of the image 31 of the character A, as shown in the Fig. 4. It is noted that the pattern of binary 1 and 0 in the Fig. 4 represents the image 31 shown in the Fig. 3A. The values of the addresses 1, 2, 3, 4 and 8 of the 3 x 3 pel window 41, which are located outside of the image 31, are binary 0. In this case, the addresses 1 - 8 of the window are the binary values 00000000. The character recognition logic circuit 10 accesses the statistics table 11 by using the above binary values as an address. Table 1 is shown below.

TABLE 1 (STATISTICS TABLE 11)

ENTRY    ADDRESS     PROBABILITY OF BLACK (%)
0        00000000     0
...
18       00010010    26
...
41       00101001    98
...
141      10001101    95
...

(Only the entries referred to in this description are shown; the table contains one entry for each of the 256 possible bit patterns of the eight neighboring pels.)

As described hereinbefore, when the current pel X of the 3 x 3 pel window 41 is positioned at the upper left pel (0, 0) of the character image 31 in the Fig. 4, the peripheral pels of the current pel X, i.e. the pels of the pel addresses 1 - 8 (Fig. 5), are 00000000. The character recognition logic circuit 10 accesses the entry 0 of the statistics table 11 by using the bit pattern 00000000 as the address, reads out the value of probability, 0 (%), and stores the value 0 (%) in the upper left position (0, 0) of the area 12A of the map buffer 12, as shown in Fig. 6.

Next, the character recognition logic circuit 10 moves the window 41 on the image 31 of the Fig. 4 by one pel position in the rightward direction, to the pel address (1, 0). In this case, the bit pattern of the addresses 1 - 8 of the window 41 is again 00000000, and the character recognition logic circuit 10 accesses the statistics table 11 by using the bit pattern as the address and writes the value of probability, 0 (%), into the pel address (1, 0) of the area 12A. In this manner, the character recognition logic circuit 10 sequentially shifts the window 41 in the rightward direction and sequentially writes the values of probability of black into the area 12A of the map buffer 12. After completing the first pel row, the character recognition logic circuit 10 positions the current pel X of the window 41 at the first pel (0, 1) of the second pel row and processes the pels of the second pel row. In this manner, all the pel rows are processed. The Fig. 6 shows the values of probability of black at each pel position of the image 31 of the character A.

For example, when the current pel X of the window 41 is positioned at the pel position (4, 0) of the Fig. 4, the bit pattern of the peripheral pels is 10001101, the entry 141 of the statistics table 11 is accessed, and the value of probability, 95 (%), is written into the pel position (4, 0) of the area 12A in the Fig. 6. And, when the current pel X of the window 41 is positioned at the pel position (2, 2) of the Fig. 4, the bit pattern of the peripheral pels is 00010010, the entry 18 of the statistics table 11 is accessed, and the value of probability, 26 (%), is written into the pel address (2, 2) of the area 12A in the Fig. 6.
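
A minimal sketch of the block 36 probability-map construction, assuming the statistics table is held as a 256-entry list indexed by the neighbor bit pattern. The clockwise ordering of the window addresses 1 - 8 is an assumption (the text only fixes address 1 as the MSB and address 8 as the LSB), and the toy table below contains only the entries quoted in this description:

```python
def neighbor_pattern(image, x, y):
    """Build the 8-bit address from the eight neighbors of pel (x, y).

    Pel address 1 of the window (Fig. 5) supplies the MSB and pel address 8
    the LSB; neighbors falling outside the image count as white (0). The
    clockwise order used here is an assumption.
    """
    h, w = len(image), len(image[0])
    offsets = [(-1, -1), (0, -1), (1, -1), (1, 0),   # assumed addresses 1..4
               (1, 1), (0, 1), (-1, 1), (-1, 0)]     # assumed addresses 5..8
    bits = 0
    for dx, dy in offsets:
        nx, ny = x + dx, y + dy
        bit = image[ny][nx] if 0 <= nx < w and 0 <= ny < h else 0
        bits = (bits << 1) | bit
    return bits

def probability_map(image, statistics_table):
    """Block 36: one statistics-table lookup per pel, written to a map area."""
    h, w = len(image), len(image[0])
    return [[statistics_table[neighbor_pattern(image, x, y)] for x in range(w)]
            for y in range(h)]

# Usage with a toy 256-entry table (zero except for the entries cited above)
table = [0] * 256
table[18], table[41], table[141] = 26, 98, 95
# area_12A = probability_map(frame, table)   # frame: a 9 x 11 binary character image
```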

Next, the operations proceed to a block 37 in the Fig. 2C, and the character recognition logic circuit 10 supplies the values of probability of the area 12A to each of the character recognition trees selected in the block 29 of the Fig. 2B, i.e. the three trees 6E, 6F and 6G for the Courier 10. Describing the operations for the tree 6E shown in the Fig. 3B, the root node 301 of the tree 6E selects the pel position (4, 6). Accordingly, the value of probability, 70 (%), in the position (4, 6) of the area 12A in the Fig. 6 is fetched. This value of 70% represents that, when the image of the character A shown in the Fig. 3A is printed one hundred times at different positions on a paper, the pel position (4, 6) of 70 characters, i.e. 70%, among the 100 characters is printed as black, and the pel position (4, 6) of 30 characters, i.e. 30%, among the 100 characters is printed as white. The operations performed when this value is inputted to the tree 6E are described with reference to the Fig. 3D. From the above description, the probability of the white output of the root node 301 is 30%, and the probability of the black output of the root node 301 is 70%. Next, the same operations are performed at the branch nodes 302 and 303. In the branch node 302, the value of probability, 95%, of the position (4, 0) is used. Accordingly, the probability of the white output of the node 302 is 30% x 5% = 1.5%, and the probability of the black output is 30% x 95% = 28.5%. Next, in the branch node 308, the value of probability of appearing black, 68%, at the pel position (1, 10) of the area 12A is used. The probability of the black output of the branch node 308 is 1.5% x 68% = 1.02%, and this value is the value of probability of the leaf node 311. The probability of the white output of the branch node 308 is 1.5% x 32% = 0.48%, and this value is the value of probability of the leaf node 310.

In the branch node 304, the value of probability, 0%, of the position (8, 5) in the area 12A is fetched. The probability of the black output of the branch node 304 is 28.5% x 0% = 0%, and this value is the probability of the leaf node 312. The probability of the white output of the branch node 304 is 28.5% x 100% = 28.5%, and this value is the probability of the leaf node 305. The same operations are performed at the branch nodes 303, 306 and 309 and the leaf nodes 307 and 313, and the probability values shown in the Fig. 3D are generated. It is noted that the probability supplied to the branch node 309 is 0%. In the case that the probability supplied to a node is 0%, the values of probability on both the white and black outputs of this branch node are 0%, so that the operations are terminated at this branch node.
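
A sketch of the block 37 propagation, reusing the `Node` and `Leaf` classes assumed in the earlier traversal sketch: the probability reaching a node is split between its white and black outputs according to the map value at the node's pel address, and a branch whose probability falls below a small cut-off (1% in the Fig. 8 description) is abandoned:

```python
def leaf_probabilities(prob_map, node, incoming=100.0, cutoff=1.0):
    """Propagate map probabilities through a tree, yielding (leaf, probability %) pairs.

    `prob_map[y][x]` is the percentage chance of black at pel (x, y), e.g. the
    area 12A of Fig. 6. Branches below `cutoff` are dropped, as in blocks 82
    and 84 of Fig. 8.
    """
    if isinstance(node, Leaf):
        yield node, incoming
        return
    x, y = node.pel
    q = prob_map[y][x]                       # probability of black at this pel (%)
    black = incoming * q / 100.0             # e.g. 30% * 95% = 28.5% at node 302
    white = incoming * (100.0 - q) / 100.0   # e.g. 30% *  5% =  1.5% at node 302
    if black >= cutoff:
        yield from leaf_probabilities(prob_map, node.on_black, black, cutoff)
    if white >= cutoff:
        yield from leaf_probabilities(prob_map, node.on_white, white, cutoff)

# Block 38: keep only leaves whose probability exceeds the 2% threshold
# candidates = [(leaf, p) for leaf, p in leaf_probabilities(area_12A, tree_6E) if p > 2.0]
```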

Next, the operations proceed to a block 38 in the Fig. 2C, in which the character recognition logic circuit 10 compares the value of probability of each leaf node with a predetermined threshold value, for example 2%, selects the leaf nodes having values of probability higher than the threshold value, and selects one of the selected leaf nodes. In the exemplary case, only the two leaf nodes 305 and 307 are selected, and the leaf node 305 is selected first. Although only two leaf nodes are selected in the exemplary case, the number of leaf nodes having a value higher than the threshold value is typically in the range of 10 - 20, since the actual tree has about 1500 leaf nodes.

Next, the operations proceed to a block 39, in which the character recognition logic circuit 10 determines whether or not the character code A in the learning buffer 9 being processed is the same as the character code B of the leaf node 305. Since the answer is NO, the operations proceed to a block 40 of the Fig. 2D. If the answer is YES, the operations proceed to a block 44 without further processing of the leaf node.

In the block 40, the character recognition logic circuit 10 fetches the character image of the leaf node 305 from the tree memory 6. In the exemplary case, the image of the character B of the leaf node 305 is fetched from the memory space 6A of the tree memory 6. The operations proceed to a block 41, and the character recognition logic circuit 10 calculates the values of probability of black for all the pels of the character image B fetched in the block 40, and writes the values into the area 12B of the map buffer 12. The operations of the block 41 are substantially the same as those of the block 36. That is, by using the 3 x 3 pel window 41, the character image of the character B is scanned. And, by using the bit pattern of the pel addresses 1 - 8 of the window 41 as the address, the statistics table 11 is accessed, and the value of probability of black appearing in the current pel X is sequentially written into the area 12B of the map buffer 12. Fig. 7 shows the values of probability of black at each pel position of the character B, written into the area 12B in the manner described above.

Next, the operations proceed to a block 42 in the Fig. 2D, and the character recognition logic circuit 10 sequentially compares the values of probability stored at the same pel positions in the areas 12A and 12B of the map buffer 12, starting from the pel position (0, 0), and finds a pel position at which the value in one of the areas 12A and 12B is low and the value in the other of the areas 12A and 12B is high. That is, the pel position at which the difference in values between the areas 12A and 12B is maximum is detected. In the exemplary case, the character recognition logic circuit 10 detects that, at the pel position (1, 1), the value of probability of the area 12A is 0%, while the value of probability of the area 12B is 100%.
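
A minimal sketch of the block 42 comparison, assuming both maps are lists of rows of percentages of the same size, as the areas 12A and 12B are:

```python
def most_discriminating_pel(map_a, map_b):
    """Return the (x, y) pel address at which two probability maps differ most.

    For the maps of Figs. 6 and 7 this is (1, 1): 0% in the area 12A
    (character A) against 100% in the area 12B (character B).
    """
    best, best_diff = None, -1.0
    for y, (row_a, row_b) in enumerate(zip(map_a, map_b)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            diff = abs(a - b)
            if diff > best_diff:
                best, best_diff = (x, y), diff
    return best
```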

The operations proceed to a block 43, and the character recognition logic circuit 10 deletes the original character code of the leaf node 305 selected by the block 38, writes the address (1, 1) found by the block 42 into the leaf node 305, and extends two nodes 316 and 317 from the leaf node 305, as shown in the Fig. 3E. And, the character recognition logic circuit 10 writes the code of the character having the higher probability at the position (1, 1), i.e. the character B, as shown in the Figs. 6 and 7, into the extended node 317 connected to the black output of the leaf node 305, and writes the code of the character having the lower probability at the position (1, 1), i.e. the character A, into the extended node 316 connected to the white output of the leaf node 305.
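
A sketch of the block 42/43 modification, reusing the assumed `Node` and `Leaf` classes and the `most_discriminating_pel` helper above. The patent writes the pel address into the existing leaf; in this assumed data layout the equivalent step is to replace the leaf with a new branch node attached to its parent:

```python
def split_leaf(parent, side, learned_code, map_learned, map_leaf):
    """Blocks 42 and 43: turn the leaf reached on `side` of `parent` into a branch.

    `map_learned` is the probability map of the erroneously recognized character
    (e.g. the area 12A for "A"); `map_leaf` is that of the character the leaf
    currently recognizes (e.g. the area 12B for "B"). `side` is "on_white" or
    "on_black".
    """
    old_leaf = getattr(parent, side)
    x, y = most_discriminating_pel(map_learned, map_leaf)
    if map_leaf[y][x] >= map_learned[y][x]:
        black_code, white_code = old_leaf.code, learned_code   # exemplary case: B on black, A on white
    else:
        black_code, white_code = learned_code, old_leaf.code
    new_branch = Node((x, y), on_white=Leaf(white_code), on_black=Leaf(black_code))
    setattr(parent, side, new_branch)    # the leaf's code is replaced by the pel address
    return new_branch

# Exemplary case: the leaf 305 hangs on the white output of the node 304
# split_leaf(node_304, "on_white", "A", area_12A, area_12B)
```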

Next, the operations proceed to the block 44, and the character recognition logic circuit 10 determines whether the processing of all the leaf nodes found by the block 38 has been completed. In the exemplary case, since the processing of the node 307 has not been completed, the answer of the block 44 is NO, and the operations return to the block 38. In the block 38, the character recognition logic circuit 10 selects the unprocessed leaf node 307 and, in the block 39, determines whether the character code of the character A of the leaf node 307 is the same as the character code of the character A in the learning buffer 9. Since the answer of the block 39 is YES, the operations proceed to the block 44 in the Fig. 2D, which detects that the processing of both the leaf nodes 305 and 307 has been completed. And, the operations proceed to a block 45 in the Fig. 2E, where the circuit 10 determines whether all the trees 6E, 6F and 6G have been processed. In the exemplary case, since the trees 6F and 6G have not been processed yet, the operations return to the block 37. When the modification of all the trees has been completed, the answer of the block 45 is YES, and the operations proceed to a block 46 in the Fig. 2E. In the block 46, the character recognition logic circuit 10 determines whether all the characters in the learning buffer 9 have been processed, that is, whether all the characters to be corrected which were found on the one page document have been processed. If the answer of the block 46 is NO, the operations return to the block 35, and if the answer is YES, the operations proceed to a block 47. In the block 47, the character recognition logic circuit 10 outputs the results of the recognition of the one page document, i.e. all the character codes stored in the output buffer 13, stores the one set of modified trees, all the character codes and all the character images in an unused memory space, and terminates the character recognition and tree modification operations in a block 48.

Fig. 8 shows the operations of the nodes in the Figs. 3B, 3D and 3E. The operations for supplying the values of probability in the map buffer 12 to the branch node 302 of the tree 6E are described. Since the value of probability inputted to the branch node 302 is 30%, the input value of probability P to a block 81 in the Fig. 8 is 30%. In the block 81, the value of probability Q, i.e. 95%, at the pel address (4, 0) of the area 12A is fetched. Next, in a block 82, the value of P x Q, i.e. 30% x 95% = 28.5%, is calculated, and this calculated value is compared with the threshold value, i.e. 1%. In this case, the answer of the block 82 is YES, and the node 302, in a block 83, generates P x Q = 28.5% on the black output, as shown in the Fig. 3D. Next, in a block 84, P x (100% - Q), i.e. 30% x 5% = 1.5%, is compared with the threshold value 1%. In this case, since the answer of the block 84 is YES, 1.5% is generated on the white output of the branch node 302.
Next, the operations for recognizing the character image stored in the working buffer 4 and having each pel represented by the binary 1, i.e. black 100%, or the binary 0, i.e. black 0%, are described.

It is assumed that the node is connected to the black output of the preceding node, so that the probability inputted to this node is 100%. The value of probability Q in the block 81 at the time of character recognition is 100% or 0%. Assuming that the value of probability Q is 100%, P x Q in the block 82 is 100%, and in the block 83, 100% is generated on the black output of this node. Since the answer of the block 84 is NO, the white output of this node generates no output.

The block 43 in the Fig. 2D performs the following additional operations. That is, after assigning the codes of the characters A and B to the extended leaf nodes 316 and 317, respectively, the values of probability of recognition of the characters A and B are stored in these nodes. These values are calculated in the following manner.

It is assumed that, before the modification, the node 305 has a probability of recognition of 40% for the character B. The values of the probability of recognition of the extended final leaf nodes 316 and 317 are decided by that probability of recognition, 40%, by the probability of 28.5% inputted to the node 305, and by the probabilities at the address (1, 1) of the areas 12A and 12B of the map buffer 12 specified by the node 305. The probability of recognition of the node 316 is calculated in the following manner. The probability of black appearing at the pel position specified by the node 305 for the character A is 0%, as apparent from the address (1, 1) of the Fig. 6. That is, the probability of white appearing at that pel position is 100%. Accordingly, the probability of recognizing the character A at the leaf node 316 is 28.5% x 100% = 28.5%, while the probability of recognizing the character B is 40% x 0% = 0%.

The probability of recognition of the leaf node 317 is calculated in the following manner. The probability of black appearing at the pel (1, 1) of the character B is 100%, as apparent from the pel address (1, 1) of the Fig. 7. Accordingly, the probability of recognizing the character B at the leaf node 317 is 40% x 100% = 40%, while the probability of recognizing the character A is 28.5% x 0% = 0%. The probabilities newly assigned to these nodes 316 and 317 are used in the block 27 in the Fig. 2A.
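
For illustration, the arithmetic just described expressed as a short sketch; the 40% prior and the 28.5% incoming probability are the exemplary figures given above:

```python
def new_leaf_probabilities(prior_leaf_prob, incoming_prob, p_black_old, p_black_new):
    """Recompute recognition probabilities after splitting a leaf (block 43).

    prior_leaf_prob: probability of recognition of the old leaf (40% for "B").
    incoming_prob:   probability supplied to the leaf during learning (28.5%).
    p_black_old/new: probability of black at the chosen pel for the old and for
                     the newly learned character (100% for "B", 0% for "A" at (1, 1)).
    """
    # Node 317 (black output) keeps the old character ("B")
    p_old = prior_leaf_prob * p_black_old / 100.0                # 40% x 100% = 40%
    # Node 316 (white output) receives the newly learned character ("A")
    p_new = incoming_prob * (100.0 - p_black_new) / 100.0        # 28.5% x 100% = 28.5%
    return p_new, p_old

print(new_leaf_probabilities(40.0, 28.5, 100.0, 0.0))  # (28.5, 40.0)
```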

For simplifying the description, the size of the frame of the working buffer 4 has been taken as 9 x 11 pels, but the frame actually has a size of 32 x 40 pels.

Although the description was directed to the learning operations of the character recognition trees, i.e. the operations for modifying the trees, the invention could also be used in an algorithm which performs character recognition by sampling the characteristics of the characters and storing, for each character, a characteristics list representing those characteristics.

Although the recognition of alphanumeric character patterns and the learning operations have been described, Japanese characters or fingerprint patterns could be used in place of the alphanumeric characters.

In the embodiment, the correction of an erroneously recognized character code to the correct code was made for each character row, but the operation could be performed every plural character rows. Although the trees are modified after the completion of character recognition of a one page document, the modification of the trees could be made after plural pages of documents have been recognized.

Claims

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:
(1) Apparatus for recognizing an image of pattern comprising:
first memory for storing image of pattern;
second memory for storing image of pattern erroneously recognized by pattern recognition procedure including plural recognition steps;
means for generating a probability map including plural storage positions, each of which stores values representing a probability of appearing black in each of picture elements of said image of pattern;
means for supplying said values as input data to said pattern recognition procedure to detect recognition step causing said erroneous recognition; and means for modifying said detected recognition step to precisely recognize said image of pattern.

(2) Apparatus according to Claim 1, wherein said pattern recognition procedure is recognition tree including a root node, plural branch nodes and plural leaf nodes with each leaf node being assigned with a character code, and said pattern is pattern of character.
(3) Apparatus according to Claim 2, wherein each node of said recognition tree stores address data of each selected picture element of said image of pattern, has two outputs each of which is connected to a succeeding node, and generates output signal depending upon a value representing a probability of appearing black in said selected picture element.

(4) Apparatus according to claim 1, wherein each of said picture elements of said image of pattern is represented by binary 1 or 0.
(5) Apparatus according to Claim 2, wherein said recognition tree is stored in a tree memory along with character patterns and character codes.
(6) Apparatus according to Claim 1, comprising an image buffer for storing images of characters of a document, means for segmenting said images of characters of one character row into separate images of each character, and means for storing said separate images of one character row into said first memory.
(7) Apparatus according to Claim 6, wherein said first memory is a working buffer.
(8) Apparatus according to Claim 6, comprising means for recognizing said separate images of said one character row, storing the results of said recognition in a resultant buffer, and storing images of said erroneously recognized characters and character codes in a learning buffer.
(9) Apparatus according to Claim 8, comprising means for fetching said image of character stored in said learning buffer, and determining said value representing a probability appearing black in a picture element of said image.
(10) Apparatus according to Claim 9, wherein said means sequentially scans each picture element and its neighboring picture elements of said image of character; accesses entry of a statistics table by using a pattern of said neighboring picture elements as an address; fetches a value stored in said entry, which represents a probability of appearing black in said central picture element surrounded by said neighboring picture elements; and stores said fetched value into storage location of said probability map corresponding to a position of said central picture element.

(11) Apparatus according to Claim 8, wherein said means for detecting recognition steps causing said erroneous recognition; supplies values in first probability map of said image of erroneously recognized character stored in said learning buffer to said recognition tree as input data;
detects leaf node of said recognition tree which generates a value of high probability of recognizing said character stored in said learning buffer; fetches a character image of character code assigned to said detected leaf node; generates a second probability map of said character image of said character code assigned to said detected leaf node; compares, for each pel address of said first and second maps, the value of probability in said first map with the value of probability in said second map; and detects pel address at which the value of one of said first and second maps is high and the value of the other map of said maps is low.
(12) Apparatus according to Claim 11, wherein said means for modifying said detected recognition step:

replaces said character code of said detected leaf node by said detected pel address; extends two nodes from said detected leaf node; assigns one of said extended nodes with said character code originally assigned to said detected leaf node, and assigns the other of said extended node with said character code of said character in said learning buffer.
(13) Character recognition apparatus for recognizing images of characters of a document comprising:
storage means for storing images of characters of said document;
character recognition trees for recognizing said images of characters;
resultant buffer means for storing character codes of characters recognized with high probability, and storing character codes and images of characters recognized with low probability;

means for correcting said character code recognized with low probability to a correct character code;
learning buffer for storing said correct character code and its character image;
means for forming first map which stores first values of probability appearing black in each of picture elements of said character image in said learning buffer;
means for supplying said first values of probability to said character recognition tree to detect leaf node which recognized said character image in said learning buffer;
means for forming second map which stores second values of probability appearing black in each of picture elements of character image of character code assigned to said detected leaf node;
means for comparing, for each pel position of said first and second maps, the first value in said first map with the second value in said second map to detect pel address at which the value of one of said maps is high and the value of the other of said maps is low; and means for replacing said character code of said detected leaf node by said detected pel address, extending two nodes from said detected leaf nodes, assigning one of said extended nodes with said character code originally assigned to said detected leaf node, and assigning the other of said extended nodes with said character code in said learning buffer.
CA000595251A 1988-04-28 1989-03-30 Image recognition apparatus Expired - Fee Related CA1317377C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP104469/88 1988-04-28
JP63104469A JPH0727543B2 (en) 1988-04-28 1988-04-28 Character recognition device

Publications (1)

Publication Number Publication Date
CA1317377C true CA1317377C (en) 1993-05-04

Family

ID=14381444

Family Applications (1)

Application Number Title Priority Date Filing Date
CA000595251A Expired - Fee Related CA1317377C (en) 1988-04-28 1989-03-30 Image recognition apparatus

Country Status (4)

Country Link
US (1) US5394484A (en)
EP (1) EP0343786A3 (en)
JP (1) JPH0727543B2 (en)
CA (1) CA1317377C (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4831657A (en) * 1988-07-19 1989-05-16 International Business Machines Corporation Method and apparatus for establishing pixel color probabilities for use in OCR logic
US5633948A (en) * 1992-11-30 1997-05-27 Kegelmeyer, Jr.; W. Philip Method and apparatus for detecting a desired behavior in digital image data
JP4071328B2 (en) * 1997-11-18 2008-04-02 富士通株式会社 Document image processing apparatus and method
JP4416980B2 (en) * 1999-10-25 2010-02-17 富士通株式会社 Bar code reading apparatus and bar code reading method
US6760490B1 (en) * 2000-09-28 2004-07-06 International Business Machines Corporation Efficient checking of key-in data entry
ATE530992T1 (en) * 2003-08-21 2011-11-15 Microsoft Corp ELECTRONIC INK PROCESSING
US7502812B2 (en) * 2003-08-21 2009-03-10 Microsoft Corporation Electronic ink processing
CA2470930A1 (en) 2003-08-21 2005-02-21 Microsoft Corporation Electronic ink processing
EP1656618A4 (en) * 2003-08-21 2007-10-17 Microsoft Corp Electronic ink processing
WO2005029391A1 (en) * 2003-08-21 2005-03-31 Microsoft Corporation Electronic ink processing
US7616333B2 (en) * 2003-08-21 2009-11-10 Microsoft Corporation Electronic ink processing and application programming interfaces
US7958132B2 (en) * 2004-02-10 2011-06-07 Microsoft Corporation Voting based scheme for electronic document node reuse
US7672516B2 (en) * 2005-03-21 2010-03-02 Siemens Medical Solutions Usa, Inc. Statistical priors for combinatorial optimization: efficient solutions via graph cuts
US8214754B2 (en) 2005-04-15 2012-07-03 Microsoft Corporation Registration of applications and complimentary features for interactive user interfaces
GB2449412B (en) * 2007-03-29 2012-04-25 Hewlett Packard Development Co Integrating object detectors
US9639493B2 (en) * 2008-11-05 2017-05-02 Micron Technology, Inc. Pattern-recognition processor with results buffer
US8660371B2 (en) * 2010-05-06 2014-02-25 Abbyy Development Llc Accuracy of recognition by means of a combination of classifiers
US20130039589A1 (en) * 2011-08-11 2013-02-14 I. R. I. S. Pattern recognition process, computer program product and mobile terminal
RU2598300C2 (en) * 2015-01-27 2016-09-20 Общество с ограниченной ответственностью "Аби Девелопмент" Methods and systems for automatic recognition of characters using forest solutions
RU2665261C1 (en) * 2017-08-25 2018-08-28 Общество с ограниченной ответственностью "Аби Продакшн" Recovery of text annotations related to information objects
CN110928216B (en) * 2019-11-14 2020-12-15 深圳云天励飞技术有限公司 Artificial intelligence device
KR20210092588A (en) * 2020-01-16 2021-07-26 삼성전자주식회사 Image processing apparatus and method thereof

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2978675A (en) * 1959-12-10 1961-04-04 Bell Telephone Labor Inc Character recognition system
US4499596A (en) * 1982-06-28 1985-02-12 International Business Machines Corporation Adaptive facsimile compression using a dynamic extendable decision network
JPS60262290A (en) * 1984-06-08 1985-12-25 Hitachi Ltd Information recognition system
JPS61114387A (en) * 1984-11-09 1986-06-02 Hitachi Ltd Recognizer of on-line handwritten character
JPS6272085A (en) * 1985-09-26 1987-04-02 Toshiba Corp Character recognizing device
US4752890A (en) * 1986-07-14 1988-06-21 International Business Machines Corp. Adaptive mechanisms for execution of sequential decisions
JPS63249282A (en) * 1987-04-03 1988-10-17 Fujitsu Ltd Multifont printed character reader

Also Published As

Publication number Publication date
EP0343786A2 (en) 1989-11-29
EP0343786A3 (en) 1992-01-29
US5394484A (en) 1995-02-28
JPH01277981A (en) 1989-11-08
JPH0727543B2 (en) 1995-03-29

Similar Documents

Publication Publication Date Title
CA1317377C (en) Image recognition apparatus
EP0439951B1 (en) Data processing
US5548700A (en) Editing text in an image
US4933979A (en) Data reading apparatus for reading data from form sheet
US5410611A (en) Method for identifying word bounding boxes in text
US5278918A (en) Optical character recognition method and apparatus using context analysis and a parsing algorithm which constructs a text data tree
CA1299292C (en) Character recognition algorithm
US5509092A (en) Method and apparatus for generating information on recognized characters
US5590220A (en) Bending point extraction method for optical character recognition system
EP0163377A1 (en) Pattern recognition system
Goraine et al. Off-line Arabic character recognition
US3925760A (en) Method of and apparatus for optical character recognition, reading and reproduction
JPH1196301A (en) Character recognizing device
US5119441A (en) Optical character recognition apparatus and method using masks operation
JP3319203B2 (en) Document filing method and apparatus
US6483943B1 (en) Feature value extraction methods and apparatus for image recognition and storage medium for storing image analysis program
JP3457376B2 (en) Character correction method in optical reader
Lebourgeois et al. An OCR System for Printed Documents.
JPS6336389A (en) Character reader
JP2887823B2 (en) Document recognition device
JP2976990B2 (en) Character recognition device
JPH0981665A (en) Character input device and method therefor
JP2000207491A (en) Reading method and device for character string
Hovakimyan et al. Armenian Texts Recognition via Neural Networks
TAKAHASHI et al. A hybrid recognition algorithm with learning capability and its application to an OCR system

Legal Events

Date Code Title Description
MKLA Lapsed