|Publication number||US6870549 B1|
|Application number||US 09/724,181|
|Publication date||Mar 22, 2005|
|Filing date||Nov 28, 2000|
|Priority date||Jun 30, 2000|
|Also published as||DE60102928D1, DE60102928T2, EP1299857A2, EP1299857B1, WO2002003330A2, WO2002003330A3|
|Inventors||Robert Edward Meredith Swann, Robert Wei Liang Tan|
|Original Assignee||Consignia Plc|
This invention relates to image processing and is of particular benefit to clustering data in binary images.
In a number of image processing applications there is a requirement to cluster objects together, i.e. to group objects which are related due to their close proximity to each other. For example, in document image processing, text objects that are close to each other may be clustered together to form paragraphs. With binary image data this clustering can be performed by merging, i.e. removing gaps between objects that are less than a specified distance limit. The difficulty in merging binary objects into clusters is the determination of the distance limit. If the distance limit is too small, some related objects remain separate. If the distance limit is too large, gaps between clusters may be removed incorrectly.
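The merging operation described above can be sketched on a single row of a binary image. This is a minimal illustration only: the function name `merge_row`, the row-wise scan and the convention that a gap is closed when its length is strictly less than the limit are assumptions made for the example, not details taken from the description.

```python
def merge_row(row, limit):
    """Fill runs of 0s lying between two 1s when the run is shorter than `limit`."""
    out = list(row)
    last_on = None  # index of the most recent foreground pixel seen
    for i, pixel in enumerate(row):
        if not pixel:
            continue
        if last_on is not None:
            gap = i - last_on - 1  # number of background pixels between the two objects
            if 0 < gap < limit:
                for j in range(last_on + 1, i):
                    out[j] = 1
        last_on = i
    return out


row = [1, 0, 0, 1, 0, 0, 0, 0, 0, 1]
print(merge_row(row, 3))  # the gap of 2 is closed, the gap of 5 is kept
print(merge_row(row, 6))  # limit too large: both gaps are closed
```

With a limit of 3 the first two objects join and the third stays separate; with a limit of 6 everything collapses into one object, which is the "limit too large" failure described above.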
If the separation between clusters is significantly larger than the gaps between objects within a cluster, then setting the distance limit is easy. Where the separation between clusters is similar to the gaps between objects in a cluster, deciding the distance limit for merging is more difficult. If the separation between clusters is smaller than the gaps between objects in a cluster, then it may be impossible for a single distance limit to merge objects into clusters without also incorrectly joining clusters together. In the example of text processing of document images, merging text into paragraphs is easy if the separation of the paragraphs is significantly larger than the gap between letters. However, if the gap between letters is similar to or larger than the separation between paragraphs (as is often the case in a document with many fonts and sizes of text), then it may not be possible to cluster all the text into paragraphs successfully using a simple binary merging operation.
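The condition for a single distance limit to work can be written as a small check. The sketch below assumes integer pixel distances and the convention that a gap is closed when it is strictly less than the limit; the function name `usable_limits` and the sample gap values are hypothetical.

```python
def usable_limits(intra_gaps, inter_gaps):
    """Return the inclusive (low, high) range of limits that close every gap
    inside a cluster but none between clusters, or None if no limit works."""
    low = max(intra_gaps) + 1  # smallest limit that closes the widest intra-cluster gap
    high = min(inter_gaps)     # largest limit that still leaves the narrowest separation open
    return (low, high) if low <= high else None


# One font size: letter gaps of 2-3 pixels, paragraphs 12 pixels apart.
print(usable_limits([2, 3], [12]))

# Mixed fonts: a headline with 9-pixel letter gaps, paragraphs only 8 pixels apart.
print(usable_limits([2, 3, 9], [8]))
```

In the first case any limit from 4 to 12 separates the paragraphs correctly. In the second case no single limit exists, which is the situation that motivates using additional segmentation information.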
In image processing applications where the clustering of binary objects is difficult because of the close proximity of the clusters, it is often helpful to use additional information to segment the binary image. The information used to segment the binary image is generally more useful if taken from a separate source or earlier form of the image. In the example of text processing in document images, the binary image of the text objects may be segmented according to background colour, calculated from the original greyscale image of the document. Unfortunately segmentation of an image can be difficult and many techniques do not adequately account for slowly varying features or incomplete region boundaries.
We have appreciated that the process of segmenting and merging in order to cluster objects in a binary image may be made more successful and computationally efficient if they are combined into a single process, where the segmentation information is represented as the boundaries between regions. Accordingly a preferred embodiment of the invention clusters together objects in a binary image by removing the gaps between objects in the cases where the gaps are less than a specified distance limit and do not cross a region boundary.
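The combined operation can be sketched as a gap-filling pass that also consults a binary image of region boundaries. This is one possible reading under stated assumptions, not the implementation of the embodiment: the separate horizontal and vertical passes, the strict "gap less than limit" test and the names `merge_line` and `merge_image` are all hypothetical.

```python
def merge_line(pixels, boundary, limit):
    """Close a gap between two objects on one scan line only if the gap is
    shorter than `limit` and contains no boundary pixel."""
    out = list(pixels)
    last_on = None
    for i, pixel in enumerate(pixels):
        if not pixel:
            continue
        if last_on is not None:
            gap = range(last_on + 1, i)
            crosses = any(boundary[j] for j in gap)
            if 0 < len(gap) < limit and not crosses:
                for j in gap:
                    out[j] = 1
        last_on = i
    return out


def merge_image(image, boundary, limit):
    """Apply merge_line along rows, then along columns (an assumed scan order)."""
    rows = [merge_line(r, b, limit) for r, b in zip(image, boundary)]
    cols = [merge_line(c, b, limit) for c, b in zip(zip(*rows), zip(*boundary))]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major order


image = [
    [1, 0, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 1],
]
boundary = [[0, 0, 0, 0, 1, 0, 0] for _ in image]  # one vertical boundary line

for row in merge_image(image, boundary, 3):
    print(row)
```

In this example all horizontal gaps are two pixels wide, so a limit of 3 alone would join everything in each row. The boundary line at column 4 keeps the right-hand objects in a separate cluster.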
We have observed that the merging of objects in a binary image into clusters can benefit from segmentation of that image. If the segmentation can separate clusters without dissecting them, it reduces the likelihood of incorrectly merging clusters together. This simplifies the requirements of the merging operation, making it easier to set a distance limit for the merging successfully. We have also observed that the requirement of merging objects in a binary image into clusters simplifies the task of segmentation. As an isolated task, the segmentation would need to separate the whole image into distinct regions. However, the merging operation has a distance limit which already keeps well-separated clusters apart. Thus the demands on the segmentation are reduced to separating regions where the clusters would otherwise be merged together. By performing the merging and segmentation simultaneously, the invention takes advantage of both the reduced requirements on the segmentation information and the simpler setting of the merging distance limit.
A preferred embodiment of the invention will now be described in detail by way of an example with reference to the accompanying drawings in which:
In an example of identifying addresses on complex envelopes in a mail processing system there is a requirement to cluster text objects into paragraphs that may be addresses. As with many other image processing applications, the merging of binary text objects into paragraphs can benefit from the use of additional information to segment the image. Accordingly, in the case of processing document images such as mail, we have proposed in our British Patent Application No. GB 2 364 416 A (corresponding PCT Pub. No. WO 02/03315 A1), filed on the same day as the current application, a method of clustering related text objects in an image document. The aforementioned application uses two segmentations, one according to text colour and one according to background colour, to segment the binary image of the text objects. In the example shown
At the same time, the greyscale image passes to a global information segmentation unit 6 which creates a segmentation for the image based on the background greyscale level by defining regions where the grey level is the same or within a predetermined range. This could also use colour information. This segmentation data relating to background grey level is then passed to a segmentation unit 8 which also receives the binary image from the text object extraction unit 4. Segmentation unit 8 then segments the binary image of text according to the global background information supplied to it.
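The two steps in this paragraph, grouping pixels by background grey level and then splitting the binary text image accordingly, might look roughly like the sketch below. It assumes that a per-pixel estimate of the background grey level is already available, and it approximates "within a predetermined range" with fixed-width grey-level bands. Both are simplifications, and the names `band_labels` and `split_by_region` are hypothetical.

```python
def band_labels(background, band_width):
    """Label each pixel by the grey-level band its background value falls into."""
    return [[g // band_width for g in row] for row in background]


def split_by_region(binary, labels):
    """Return one binary mask per label, holding only the text pixels in that region."""
    height, width = len(binary), len(binary[0])
    regions = {}
    for y in range(height):
        for x in range(width):
            mask = regions.setdefault(labels[y][x], [[0] * width for _ in range(height)])
            if binary[y][x]:
                mask[y][x] = 1
    return regions


background = [[200, 200, 90, 90],
              [200, 205, 95, 90]]  # assumed background grey-level estimate
binary = [[1, 0, 1, 0],
          [0, 1, 0, 1]]            # binary image of text objects

labels = band_labels(background, 64)
for label, mask in split_by_region(binary, labels).items():
    print(label, mask)
```

Each mask can then be merged independently, so text on the light background is never joined to text on the dark background regardless of the distance limit.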
The output data from segmentation unit 8 is then passed to the merging unit 10 which, for each segmented region, merges text objects according to the distance between letters. Letters which are separated by less than a predetermined distance are merged together and ones which are further apart are not. This produces text blocks which are passed to a sorting unit 12. This sorts the text blocks according to those most likely to contain an address (in the case of a mail processing system) and passes these to an optical character recognition (OCR) unit 14.
Steps 8 and 10 in
An embodiment of the invention as shown in
The system of
The segmentation and merging unit 24 then merges together objects in the binary image without crossing any region boundaries. Merging is performed by removing gaps between objects that are below the merging distance limit and which do not cross a region boundary. Normally, such a process will be performed by conventional hardware programmed with software to perform this process. However, dedicated hardware circuitry could be provided to implement the segmentation and merging. The benefit of using boundaries for segmentation is illustrated in
A further advantage of this simultaneous merging and segmentation is that, whereas normal segmentation information needs to be able to segment the whole image, in this particular system it only needs to represent a region boundary. This can be merely a line; it does not need to enclose a distinct region. In document image processing, the text objects, the background colour, text colour, text orientation, etc., can all be used to segment the whole image. With the current technique, however, incomplete boundaries such as bold lines, the location of images and logos, etc., can also be used successfully to aid clustering of text objects. In addition, repeated segmentations are normally computationally intensive. The present technique requires only a binary image of the lines not to be crossed during merging. Thus, multiple segmentations reduce to OR-ing a number of binary images to create a complete binary image of the region boundaries, which is computationally much less intensive.
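Combining several boundary sources by a pixel-wise OR can be sketched as follows. The function name `combine_boundaries` and the two sample boundary images are hypothetical; the only assumption is that every boundary image has the same dimensions.

```python
def combine_boundaries(*boundary_images):
    """Pixel-wise OR of any number of equally sized binary boundary images."""
    return [
        [int(any(pixels)) for pixels in zip(*rows)]
        for rows in zip(*boundary_images)
    ]


ruled_line = [[0, 1, 0],
              [0, 1, 0]]  # e.g. a bold vertical line on the document
logo_edge = [[0, 0, 0],
             [1, 1, 1]]   # e.g. the top edge of a logo

print(combine_boundaries(ruled_line, logo_edge))
```

The result is a single binary image that marks a pixel as a boundary if any source marks it, so adding a further source of segmentation information costs one more OR rather than a further full segmentation of the image.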
The embodiment shown in
The invention performs the clustering of objects in a binary image where the clusters are described by a maximum distance allowed between objects and by some information that implies segmentation between clusters. The segmentation information is supplied as a binary image of region boundary lines that are not crossed during the clustering operation. The region boundaries do not have to be complete. Since the invention is a general image processing technique for clustering of binary objects there are numerous applications. The main example used in this description has been the clustering of text objects into paragraphs in document images. Other applications could include:
|U.S. Classification||345/636, 382/180, 382/173|
|Cooperative Classification||G06T2207/30176, G06T7/11, G06K9/342, G06K9/38, G06T2207/10008, G06K2209/01|
|European Classification||G06K9/38, G06K9/34C, G06T7/00S1|
|May 21, 2001||AS||Assignment|
Owner name: CONSIGNIA PLC, ENGLAND
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SWANN, ROBERT EDWARD MEREDITH;TAN, ROBERT WEI LIANG;REEL/FRAME:011803/0674
Effective date: 20010417
|Sep 29, 2008||REMI||Maintenance fee reminder mailed|
|Mar 22, 2009||LAPS||Lapse for failure to pay maintenance fees|
|May 12, 2009||FP||Expired due to failure to pay maintenance fee|
Effective date: 20090322