CA2365721A1 - Determining the font of text in an image - Google Patents

Determining the font of text in an image

Info

Publication number
CA2365721A1
Authority
CA
Canada
Prior art keywords
font
text
image
determining
probabilities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CA002365721A
Other languages
French (fr)
Other versions
CA2365721C (en)
Inventor
David Goldberg
Marshall W. Bern
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xerox Corp
Original Assignee
Xerox Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2000-12-28
Filing date
2001-12-20
Publication date
2002-06-28
Application filed by Xerox Corp
Publication of CA2365721A1
Application granted
Publication of CA2365721C
Anticipated expiration
Expired - Fee Related

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/24Character recognition characterised by the processing or recognition method
    • G06V30/242Division of the character sequences into groups prior to recognition; Selection of dictionaries
    • G06V30/244Division of the character sequences into groups prior to recognition; Selection of dictionaries using graphical properties, e.g. alphabet type or font
    • G06V30/245Font recognition

Abstract

Systems and methods are provided for automatically determining the font of text in a captured document image. Sequences of turns (left, right, straight) around the boundaries of connected components of black pixels in the captured document image are determined. The probabilities that these sequences of turns came from a particular font within a library of known fonts can be determined using training set statistics. Using these probabilities, the most probable source font is selected.
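The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the method claimed in the patent: the abstract only says that "training set statistics" are used, so the n-gram counts with add-one smoothing below are an assumption, as are all function names (`turns_from_path`, `train`, `most_probable_font`) and the input format (a closed boundary given as a list of grid vertices in a y-up coordinate system).

```python
import math
from collections import Counter

TURNS = "LRS"  # left, right, straight


def turns_from_path(points):
    """Turn sequence for a closed boundary path given as grid vertices.

    The sign of the 2D cross product of consecutive steps decides the
    turn (y-up coordinates): positive = left, negative = right, 0 = straight.
    """
    n = len(points)
    out = []
    for i in range(n):
        (x0, y0), (x1, y1), (x2, y2) = (
            points[i], points[(i + 1) % n], points[(i + 2) % n])
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        out.append("L" if cross > 0 else "R" if cross < 0 else "S")
    return "".join(out)


def ngram_counts(seq, n):
    """Count every length-n window of a turn sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def train(samples, n=3):
    """samples: {font_name: [turn sequences]} -> {font_name: (counts, total)}."""
    model = {}
    for font, seqs in samples.items():
        counts = Counter()
        for s in seqs:
            counts.update(ngram_counts(s, n))
        model[font] = (counts, sum(counts.values()))
    return model


def log_likelihood(entry, seq, n=3):
    """Log probability of seq's n-grams under one font (add-one smoothing)."""
    counts, total = entry
    vocab = len(TURNS) ** n  # number of possible n-grams
    return sum(c * math.log((counts[g] + 1) / (total + vocab))
               for g, c in ngram_counts(seq, n).items())


def most_probable_font(model, seqs, n=3):
    """Pick the font whose statistics best explain all observed sequences."""
    scores = {font: sum(log_likelihood(entry, s, n) for s in seqs)
              for font, entry in model.items()}
    return max(scores, key=scores.get), scores


# Toy usage with made-up training data for two hypothetical fonts.
model = train({"FontA": ["LLLLLLLL"], "FontB": ["RSRSRSRS"]})
square = [(0, 0), (1, 0), (1, 1), (0, 1)]  # counterclockwise unit square
best, scores = most_probable_font(model, [turns_from_path(square) + "L"])
```

Treating the n-grams of a sequence as independent keeps the sketch short; a real system would also need connected-component extraction and boundary tracing, which are omitted here.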
CA002365721A 2000-12-28 2001-12-20 Determining the font of text in an image Expired - Fee Related CA2365721C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/749,690 US6690821B2 (en) 2000-12-28 2000-12-28 Determining the font of text in an image
US09/749,690 2000-12-28

Publications (2)

Publication Number Publication Date
CA2365721A1 (en) 2002-06-28
CA2365721C (en) 2005-07-26

Family

ID=25014764

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002365721A Expired - Fee Related CA2365721C (en) 2000-12-28 2001-12-20 Determining the font of text in an image

Country Status (5)

Country Link
US (1) US6690821B2 (en)
EP (1) EP1220142B1 (en)
BR (1) BR0106463A (en)
CA (1) CA2365721C (en)
DE (1) DE60104971T2 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7543758B2 (en) * 2005-12-20 2009-06-09 Xerox Corporation Document localization of pointing actions using disambiguated visual regions
CN100461155C (en) * 2006-04-06 2009-02-11 华为技术有限公司 Method and system for inputting and displaying character
US7480411B1 (en) * 2008-03-03 2009-01-20 International Business Machines Corporation Adaptive OCR for books
US20100329537A1 (en) * 2009-06-25 2010-12-30 Gardi Michael E Computer-implemented methods of identifying an optical character recognition (ocr) font to assist an operator in setting up a bank remittance coupon application
US9842281B2 (en) * 2014-06-05 2017-12-12 Xerox Corporation System for automated text and halftone segmentation
US11537262B1 (en) 2015-07-21 2022-12-27 Monotype Imaging Inc. Using attributes for font recommendations
US11334750B2 (en) 2017-09-07 2022-05-17 Monotype Imaging Inc. Using attributes for predicting imagery performance
US10909429B2 (en) 2017-09-27 2021-02-02 Monotype Imaging Inc. Using attributes for identifying imagery for selection
US11657602B2 (en) 2017-10-30 2023-05-23 Monotype Imaging Inc. Font identification from imagery
US10402673B1 (en) 2018-10-04 2019-09-03 Capital One Services, Llc Systems and methods for digitized document image data spillage recovery
US11074473B1 (en) 2020-01-21 2021-07-27 Capital One Services, Llc Systems and methods for digitized document image text contouring

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE3815869A1 (en) * 1987-05-08 1988-11-17 Ricoh Kk Method for the extraction of attribute quantities of a character
JP2806961B2 (en) * 1989-02-22 1998-09-30 株式会社リコー Image coding method
US5253307A (en) * 1991-07-30 1993-10-12 Xerox Corporation Image analysis to obtain typeface information
US5245674A (en) * 1991-07-30 1993-09-14 Xerox Corporation Image processing using distance as a function of direction
US5315668A (en) * 1991-11-27 1994-05-24 The United States Of America As Represented By The Secretary Of The Air Force Offline text recognition without intraword character segmentation based on two-dimensional low frequency discrete Fourier transforms
CA2125608A1 (en) * 1993-06-30 1994-12-31 George M. Moore Method and system for providing substitute computer fonts
JP3008908B2 (en) * 1997-11-10 2000-02-14 日本電気株式会社 Character extraction device and character extraction method
US6337924B1 (en) * 1999-02-26 2002-01-08 Hewlett-Packard Company System and method for accurately recognizing text font in a document processing system

Also Published As

Publication number Publication date
DE60104971D1 (en) 2004-09-23
EP1220142A2 (en) 2002-07-03
CA2365721C (en) 2005-07-26
EP1220142A3 (en) 2003-03-19
EP1220142B1 (en) 2004-08-18
DE60104971T2 (en) 2005-01-20
US20020122594A1 (en) 2002-09-05
BR0106463A (en) 2002-09-24
US6690821B2 (en) 2004-02-10

Similar Documents

Publication Publication Date Title
EP0827332A3 (en) Apparatus and method for modifying enlarged ratio or reduced ratio of image
EP0614153A3 (en) Method for segmenting features in an image.
HK1027411A1 (en) Character printing method and device as well as image forming method and device.
CA2365721A1 (en) Determining the font of text in an image
EP0991011A3 (en) Method and device for segmenting hand gestures
EP1139290A3 (en) Image processing apparatus and method
MXPA02012538A (en) Image segmentation system and method.
WO2006052858A3 (en) Apparatus and method for providing visual indication of character ambiguity during text entry
EP0622212A3 (en) Images printing method.
DE69837502D1 (en) Transmitting VBI information in digital television data streams
EP0618545A3 (en) Image processing system suitable for colored character recognition.
WO2004023787A3 (en) Signal intensity range transformation apparatus and method
EP1289303A3 (en) Image coding apparatus and image decoding apparatus
EP1282307A3 (en) Data reproduction apparatus and data reproduction method
EP0993202A3 (en) Variable rate MPEG-2 video syntax processor
EP0654749A3 (en) An image processing method and apparatus.
EP0654943A3 (en) Image enhancement method and circuit.
EP1174824A3 (en) Noise reduction method utilizing color information, apparatus, and program for digital image processing
WO2003021788A3 (en) Component-based, adaptive stroke-order system
EP0632403A3 (en) Handwritten symbol recognizer.
EP0866415A3 (en) Method of locating a machine-readable marker within an image
CA2338398A1 (en) Display capture system
EP0685959A3 (en) Image processing apparatus for identifying character, photo and dot images in the image area.
WO2002037939A3 (en) Method of constructing a composite image within an image space of a webpage
WO2002063868A3 (en) System and method for scaling and enhancing color text images

Legal Events

Date Code Title Description
EEER Examination request
MKLA Lapsed

Effective date: 20161220