US 20020131636 A1
An apparatus capable of scanning in documents is disclosed. According to one embodiment, a linear scanner is integrated with a portable computing device and exposed to a scanning document. An image is generated when there is a relative motion between the linear scanner and the scanning document. Texts in the image are reproduced by an OCR engine in the apparatus and can be displayed on a display screen of the portable computing device. The reproduced texts can then be filled into appropriate fields of records maintained in the device or transported to another device.
1. A palm office assistant comprising:
a portable computing device;
a scanner integrated with the portable computing device, the scanner exposing to a scanning object and generating an image thereof when the portable computing device and the scanning object have a relative motion; and
wherein the portable computing device includes an optical recognition engine that receives the image and produces texts therefrom.
2. The palm office assistant of
3. The palm office assistant of
4. The palm office assistant of
5. The palm office assistant of
6. The palm office assistant of
7. The palm office assistant of
8. A palm office assistant comprising:
a portable computing device including a display screen and an interface;
a pen-style scanner communicating with the portable computing device through the interface, the pen-style scanner scanning in an image of texts on a document and reproducing the texts through a built-in OCR engine; and
wherein the texts are displayed on the display screen of the portable computing device.
9. A palm office assistant comprising:
a portable computing device including a display screen;
a linear scanner integrated with the portable computing device and exposing to a scanning document, the linear scanner including a contact image sensor and generating an image when the linear scanner moves across an area of the scanning document, the area including texts, wherein the image is processed in the portable computing device executing an OCR engine to reproduce the texts; and
wherein the texts can be displayed on the display screen when requested.
10. The palm office assistant of
11. The palm office assistant of
12. The palm office assistant of claim 11, wherein the portable computing device is able to transport the one or more records to another device via a communication link.
13. The palm office assistant of claim 12, wherein the communication link is either a wired link or a wireless link.
14. The palm office assistant of
15. The palm office assistant of
 1. Field of the Invention
 The present invention relates to portable optical scanners, and more particularly to a method and system for capturing alphanumeric characters, symbols or biometric tokens (i.e., fingerprints) with low cost point optical scanners integrated with portable computing devices such as palm sized computing devices, personal digital assistants (PDA's), smart phones and cellular phones with digital network connectivity (i.e., WAP phones).
 2. Description of the Related Art
 Personal Data Interchange (PDI) occurs every time two or more individuals communicate, in either a business or personal context. Such interchanges frequently include the exchange of information containing alphanumeric characters and symbols, such as business cards, purchase receipts, tickets, contracts, and other types of documents. To manage the amount of information gathered via PDI many individuals use portable hand-held computing devices such as handheld computers, personal digital assistants, smart phones and WAP enabled phones with limited user interfaces, limited power resources and limited computing and memory resources.
 Capturing the data into a portable computing device with limited user interfaces has always been a challenge. One of the popular portable computing devices (e.g. a palm handheld from Palm, Inc.) provides a writing pad from which a user can write in texts letter by letter. Typically, it takes minutes to input relevant information from a business card into such device. Although such input is tedious and laborious, a portable computing device provides more conveniences that have outweighed the awkward input mechanism.
 Many business cards have a symbolic emblem or a graphic logo beside a name, a title, one or more phone numbers and an e-mail address. One of the purposes of such a graphic logo is to leave the recipient of the business card with a strong impression of the business entity and the services or products it represents. The above text-based input, however, discards the graphic logo, which is certainly not what the business entity issuing the card desires.
 There is therefore a need for a portable device that facilitates an easy input mechanism so that a user can read in text information as well as graphic information.
 To be Finished After your Review.
 The foregoing and other objects, features and advantages of the invention will become more apparent from the following detailed description of a preferred embodiment, which proceeds with reference to the accompanying drawings.
 The present invention will be readily understood by the following detailed description in conjunction with the accompanying drawings, wherein like reference numerals designate like structural elements, and in which:
FIG. 1 is a block diagram of a personal digital assistant having a connected pen scanner which may be used to implement the method and system embodying the present invention;
FIGS. 2A and 2B illustrate a Personal Digital Assistant having an integrated business card scanner which may be used to implement the method and system embodying the present invention;
FIG. 3 illustrates a functional block diagram of the functional hardware components associated with a representative wireless communications device which may be used to implement the method and system embodying the present invention;
FIG. 4 illustrates a functional block diagram of the functional software modules associated with a representative wireless communications device which may be used to implement the method and system embodying the present invention;
FIG. 5 illustrates a representative screen display indicating the results of post capture processing of finger print information in accordance with an embodiment of the present invention;
FIG. 6A shows a diagram in which a palm office assistant may communicate with a personal computer, a wireless network or a general computing device;
FIG. 6B shows an exemplary contact image sensor that may be used in a linear scanner to be integrated with a portable computing device;
FIG. 7 is a flow diagram of the process associated with the rule based processing of the captured components in accordance with an embodiment of the present invention.
 The invention pertains to portable devices equipped with methods and systems for capturing, processing, storing and augmenting alphanumeric characters, symbolic data, graphic and biometric representations resident on documents. The portable devices may include, but not be limited to, a palm sized computing device, a personal digital assistant, a smart phone and a data network enabled cell phone. According to one aspect of the present invention, a linear scanner is integrated with a portable device and serves as an input mechanism for the portable device to scan in texts, alphanumeric characters, symbolic data, graphic or biometric representations. Specifically, contents resident on documents are scanned in (i.e. captured), classified and processed using type-specific software modules resident on the subject portable device or on accessible remote server devices. One of the advantages and benefits provided in the present invention is a portable or palm office assistant, incorporating some of the basic office utilities into a portable device.
 The detailed description of the invention is presented largely in terms of procedures, steps, logic blocks, processing, and other symbolic representations that directly or indirectly resemble the operations of data processing devices coupled to networks. These process descriptions and representations are typically used by those skilled in the art to most effectively convey the substance of their work to others skilled in the art. Reference herein to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Further, the order of blocks in process flowcharts or diagrams representing one or more embodiments of the invention does not inherently indicate any particular order nor imply any limitations in the invention.
 Referring now to the drawings, in which like numerals refer to like parts throughout the several views, FIG. 1 shows a basic system configuration in which the present invention may be implemented in accordance with a preferred embodiment. Personal Digital Assistant (also referred to as a PDA herein) 102 has connectivity to a wireless communications data network (not shown) such as a Wireless Application Protocol (WAP) network and is connected to ScanPen 106 via a serial connection port such as a USB port. It would be understood by one of ordinary skill in the art that the present invention may be practiced using a personal digital assistant with an integrated scanning device or a scanning device connected to the personal digital assistant by a short range wireless communications means (e.g., Bluetooth or Infrared).
 ScanPen 106 is a pen-sized, beam scanning, information collection device that includes a sensor and an associated optical apparatus that collects and focuses light reflected from a document onto the sensor. ScanPen 106 may function as a 1D or 2D scanner depending on the configuration of the sensor. In operation, a user holds ScanPen 106 and moves it across a text printed on a document. As a result, an image of the text is captured. The image may be transported to PDA 102 for Optical Character Recognition (OCR) if PDA 102 has enough computing resources. Optionally, ScanPen 106 has an embedded language specific optical character recognition engine (e.g., an ASIC) which performs onboard OCR processing, thereby relieving PDA 102 of a portion of the processing burden when processing characters.
 According to one embodiment of the present invention, document 110 containing text 112 and a graphic logo or fingerprint 114 may be scanned in using ScanPen 106. Text 112 is processed by the embedded OCR processing module resident within ScanPen 106 and subsequently processed by an application module resident on PDA 102. In other words, texts in the scanned image are first optically recognized and then forwarded to PDA 102. The application module in PDA 102 may parse the recognized texts and then extract text sets in accordance with a set of predefined rules. When text 112 is from a business card, first and last names, phone numbers, addresses and e-mail addresses can be extracted and filled into appropriate fields on a record.
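 The rule-based extraction described above can be illustrated with a minimal sketch. The disclosure does not specify the predefined rules, so the regular expressions, the field names and the `extract_fields` helper below are hypothetical, and the "first unmatched line is the name" heuristic is an assumption made only for illustration.

```python
import re

# Hypothetical rule set: each rule maps a record field to a regular expression.
# These patterns are illustrative assumptions, not rules from the disclosure.
RULES = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\(?\d{3}\)?[-. ]\d{3}[-. ]\d{4}"),
}


def extract_fields(ocr_lines):
    """Fill a contact record from OCR'd text lines using the rules above."""
    record = {"name": None, "company": None, "phone": None, "email": None}
    unmatched = []
    for line in ocr_lines:
        text = line.strip()
        if not text:
            continue
        hit = False
        for field, pattern in RULES.items():
            match = pattern.search(text)
            if match and record[field] is None:
                record[field] = match.group(0)
                hit = True
        if not hit:
            unmatched.append(text)
    # Assumed layout heuristic: the first two lines that match no rule
    # are treated as the name and the company, in that order.
    if unmatched:
        record["name"] = unmatched[0]
    if len(unmatched) > 1:
        record["company"] = unmatched[1]
    return record


card = ["Jane W. Dow", "Gourmet Coffee Service", "650-550-4011", "email@example.com"]
record = extract_fields(card)
```

 A practical rule set would likely need locale-specific phone formats and address handling; the sketch only shows how recognized strings could be routed to record fields.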
 Graphic print 114 can also be scanned in and may be converted to a Fourier Transformed Image for identification matching if it is a personal fingerprint. Software modules resident on PDA 102 or a remotely accessible server device may perform the Fourier processing. If graphic print 114 is a general graphic representation, the scanned image may be retained and placed in an appropriate field, if there is one, in a record.
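 As a hedged illustration of the Fourier processing mentioned above, the following sketch computes the magnitude of a two-dimensional discrete Fourier transform of a tiny image and compares two spectra. The disclosure does not state how transformed images are matched; the cosine-similarity score and the function names `dft2_magnitude` and `similarity` are assumptions. The direct summation is only practical for very small images.

```python
import cmath
import math


def dft2_magnitude(image):
    """Return the magnitude of the 2-D discrete Fourier transform of an image."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * math.pi * (u * y / rows + v * x / cols)
                    total += image[y][x] * cmath.exp(1j * angle)
            out[u][v] = abs(total)
    return out


def similarity(spec_a, spec_b):
    """Cosine similarity between two magnitude spectra (assumed match score)."""
    a = [value for row in spec_a for value in row]
    b = [value for row in spec_b for value in row]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy 4x4 "ridge" pattern and a circularly shifted copy of it.
ridges = [[1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 0]]
shifted = [row[1:] + row[:1] for row in ridges]
score = similarity(dft2_magnitude(ridges), dft2_magnitude(shifted))
```

 Because the DFT magnitude is unchanged by a circular shift of the input, the two toy patterns score as essentially identical, which hints at why a transformed image can be convenient for matching prints captured at slightly different positions.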
 According to one aspect of the present invention, a user could capture the content on a document of interest, store portions of that document in discrete applications (i.e., e-mail contact list, address books etc.) and definitively confirm the recognition of the document by looking at a display screen 116 of PDA 102. Upon process initiation, software modules resident on the terminal device or embedded as an applet or an application cause a series of screen displays and associated network interactive events (i.e., identity confirmation) to occur. The captured images and event indications may be supplemented with validation information (i.e., a time stamp, a user specific electronic signature and validation information which may take the form of alpha numeric character strings or machine readable marks) and then processed for storage and future display. This information may be stored locally and/or on a designated remote server device for future reference.
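 One way to picture the validation information described above is a time stamp paired with a keyed digest of the captured image. The disclosure does not name a signature scheme; HMAC-SHA256 is swapped in here purely as an assumption, and `stamp_capture`, `verify_capture`, the key and the time stamp value are hypothetical.

```python
import hashlib
import hmac


def stamp_capture(image_bytes, user_key, timestamp):
    """Attach a time stamp and a user-keyed digest to a captured image."""
    message = timestamp.encode("ascii") + b"|" + image_bytes
    signature = hmac.new(user_key, message, hashlib.sha256).hexdigest()
    return {"timestamp": timestamp, "signature": signature}


def verify_capture(image_bytes, user_key, stamp):
    """Recompute the digest and compare it in constant time."""
    expected = stamp_capture(image_bytes, user_key, stamp["timestamp"])
    return hmac.compare_digest(expected["signature"], stamp["signature"])


# Placeholder values for illustration only.
stamp = stamp_capture(b"scanned-image-bytes", b"user-secret", "2000-01-01T00:00:00Z")
ok = verify_capture(b"scanned-image-bytes", b"user-secret", stamp)
```

 The resulting record could be stored locally or forwarded to a remote server alongside the image, as the paragraph above contemplates.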
 It is defined without the loss of generality that an interaction involves activities (e.g., display content, user input/output in response to what is being displayed in the subject terminal device) performed by a user with respect to the user interface of a terminal device that may or may not be networked. Some advantages of the present invention include the ability to archive documents of interest and the ability to validate the identity of those presenting the documents. Additionally, ScanPen 106, which may be powered by PDA 102, can be of a much simpler design than the pen scanners seen in the market that perform language translation.
FIGS. 2A and 2B illustrate a PDA 202 having an integrated business card scanner 206 which may be used to implement the method and system embodying the present invention. One of ordinary skill in the art will comprehend that multiple types of integrated scanners (detachable or non-detachable) other than the business card configuration shown may be used without deviating from the principles or scope of the present invention.
 In FIG. 2A, a business card 208 containing text 210, a symbol 212 and a fingerprint 214 is passed through business card scanner 206. The content is captured and buffered for processing by application (or software) modules resident on PDA 202 or on accessible remote server devices. The resolution (e.g., in dpi) of the captured image components may be re-configurable depending on the scanning object. For example, if Fourier identity confirmation/fingerprint analysis is required, then the resolution requirements will be higher than if only simple OCR is required.
 Referring to FIG. 2B, the image of the business card 224 has been captured and is optionally displayed on the screen of PDA 202. Upon OCR processing and subsequent template matching, some of the captured contents have been associated with particular field types. <Jane W. Dow> was recognized as a name, <Gourmet Coffee Service> was recognized as a company/service name, <650-550-4011> was recognized as a phone number and <email@example.com> was recognized as an email address. Since these particular string types fit a certain profile, some functionality may be applied to them. For example, icon 218 is a link which will enable a phone call to be placed without having to dial the number, and icon 220 is a link which will automatically fill in the <To:> space for an e-mail. Additionally, icon 222 indicates that a fingerprint image has been captured. The fingerprint image may be used for identity confirmation as will be further described below.
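 The links behind icons 218 and 220 can be pictured as a mapping from a recognized string type to an actionable address. The sketch below is an assumption about how such a mapping might look; the `action_link` helper and the `tel:`/`mailto:` style strings are illustrative choices, not something the disclosure specifies.

```python
def action_link(field_type, value):
    """Map a recognized string to a link string that a handler could act on."""
    if field_type == "phone":
        # Keep only the digits so a dialer does not need to parse punctuation.
        digits = "".join(ch for ch in value if ch.isdigit())
        return "tel:" + digits
    if field_type == "email":
        return "mailto:" + value
    # Other string types (names, company names) get no link in this sketch.
    return None


phone_link = action_link("phone", "650-550-4011")
email_link = action_link("email", "email@example.com")
```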
FIG. 3 illustrates a functional block diagram of the functional hardware components associated with a representative wireless communications device which may be used to implement the method and system embodying the present invention. The device includes an integrated scanner 304 as described above, a microprocessor such as an ARM™ processor 308, an LCD and associated control circuitry 312, a user interface 316 (i.e., keypad/keyboard and a pointing device), power management circuitry 320, memory 324 (RAM and ROM), a power supply 328 and a transceiver and associated circuitry for communication with wireless data networks (i.e., GSM, TDMA, CDMA, PCS etc.). All of the parts may communicate through a data bus controlled by the microprocessor. The detailed designs of the bus as well as respective interfaces of the parts to the bus are known to those skilled in the art.
FIG. 4 illustrates a functional block diagram of the functional software modules associated with a representative wireless communications device which may be used to implement the method and system embodying the present invention. System configuration 400 includes a device driver 404 which facilitates communication between the processing circuitry and an attached scanning device, an application program interface (API) 408, a variable power management module 412 which maximizes device alive time between charges, scanner control module 416 which controls the available scanning modes (i.e., 1D, 2D, max resolution etc.) depending on the power management scheme implemented by variable power management module 412, and the various processing modules required to process distinct captured component types.
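 The interplay between the power management scheme and the available scanning modes might be sketched as a small policy function. The battery threshold, the resolutions and the names `ScanMode` and `select_mode` are hypothetical; the disclosure states only that the available modes depend on the power management scheme and that fingerprint analysis calls for higher resolution than simple OCR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanMode:
    dimensions: str  # "1D" or "2D"
    dpi: int         # requested sensor resolution


def select_mode(battery_percent, needs_fingerprint):
    """Pick a scanning mode from remaining battery and content type.

    All numeric values are illustrative assumptions.
    """
    if battery_percent < 15:
        return ScanMode("1D", 200)   # low battery: restrict to the cheapest mode
    if needs_fingerprint:
        return ScanMode("2D", 600)   # fingerprint analysis wants more detail
    return ScanMode("1D", 300)       # plain OCR of text lines


mode = select_mode(battery_percent=80, needs_fingerprint=True)
```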
 Optionally, fingerprint recognition control module 420 identifies captured content types corresponding to images of fingerprints and transforms those images (e.g., into Fourier Transformed images) to a format that can be processed for identity confirmation. Image compression module 424 compresses the captured content so as to maximize storage efficiency and transmission bandwidth utilization. OCR engine module 428 coordinates the activities of the embedded OCR processing element associated with the scanning mechanism (e.g., language selection, predictive text processing etc.). Template matching module 432 matches distinct string types (e.g., names, phone numbers, email addresses) using pre-defined rules and implements usable links based on type classification (e.g., speed dialing a phone number by interacting with a link). Software modules 434 and 438 coordinate the configuration of the sensing elements in accordance with optimized configurations for scanning components (text or images respectively).
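 The disclosure does not say which compression method image compression module 424 uses. As one simple possibility, a bilevel scan line could be run-length encoded; the sketch below is offered only under that assumption, and `rle_encode` and `rle_decode` are hypothetical helper names.

```python
def rle_encode(line):
    """Run-length encode one bilevel scan line into [value, run_length] pairs."""
    runs = []
    for pixel in line:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([pixel, 1])   # start a new run
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into a list of pixels."""
    return [value for value, count in runs for _ in range(count)]


scan_line = [0, 0, 0, 1, 1, 0, 0, 0, 0]
encoded = rle_encode(scan_line)
```

 Text on a light background tends to produce long runs of identical pixels, which is why a run-based scheme is a natural first guess for scanned documents.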
FIG. 5 illustrates a representative screen display indicating the results of post capture processing of fingerprint information in accordance with an embodiment of the present invention. In this illustrative example, captured fingerprint image 506 is compared with previously stored transformed images housed at a Trusted Third Party Service Provider (e.g., a certification authority) and, if a match is found, selected information may be returned to the requesting entity. The screen shot provided shows a picture of the individual 508, the identity of the certification authority 510 and links to some commercial material provided by the person associated with the business card.
FIG. 6A shows a diagram 600 in which a palm office assistant 602 may communicate with a personal computer 606, a wireless network 608 or a general computing device 610. Palm office assistant 602 may correspond to device 102 of FIG. 1 or device 202 of FIG. 2. One of the features in palm office assistant 602 is the scanning mechanism that facilitates image scanning of a scanning object that may include texts and graphic representations. With the scanning mechanism, a user of palm office assistant 602 can input lengthy texts with ease, substantially solving the text input problem commonly seen on portable devices without a fully functional keyboard.
 According to one embodiment, palm office assistant (POA) 602 employs a contact image sensor 620 as shown in FIG. 6B. Referring now to FIG. 6B, there is shown a cross section view of an exemplary contact image sensor module. A light source 622 provides illumination to a scanning object (not shown) under a cover glass 624. For color scanning, light source 622 provides three different illuminations, e.g. red, green, and blue lights, to the scanning object. The scanning object, not shown in the figure, may be a sheet of paper (e.g. a business card) placed face up and scrolled or moved across the cover glass 624 such that the scanning side is illuminated by the light source 622. The cover glass 624 is transparent and provides a focus means for the paper to be properly scanned. When the light source 622 emits light onto the paper as indicated by 626, the light reflected from the paper through the cover glass 624 is directed at the optical lens 628, which is generally an array of erect graded index micro (cylindrical or rod) lenses. It should be understood that the present invention is independent of the optical lens and the light source. The use of the particular light source and the lens array in this configuration facilitates the description of the present invention and imposes no limitation thereon. Under the optical lens 628, there is an image sensor 630 comprising an array of photodetectors made of CMOS or CCD sensors. The array can be configured as a one-dimensional array or a two-dimensional array, often referred to as a linear sensor or an area sensor respectively. It should be noted that the description herein is based on the linear sensor; those skilled in the art will appreciate that the principles of the present invention can be equally applied to the two-dimensional array as well. The optical lens 628 collects the reflected light onto the photodetectors that convert the reflected light to electronic signals proportionally representing the intensity of the reflected light. The electronic signals are then transferred to the data bus 632 through a connector 634.
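 The proportional conversion and the three-illumination color scheme described above can be sketched as follows. The 8-bit output range, the normalized intensity scale and the names `quantize` and `combine_rgb` are assumptions; the disclosure states only that the signals are proportional to reflected intensity and that red, green and blue illuminations are used for color scanning.

```python
def quantize(intensity, full_scale=1.0, bits=8):
    """Convert a reflected-light intensity into a proportional digital code."""
    levels = 2 ** bits - 1
    clamped = min(max(intensity, 0.0), full_scale)
    # Round half up so the mapping is easy to reason about.
    return int(clamped / full_scale * levels + 0.5)


def combine_rgb(red_line, green_line, blue_line):
    """Merge three exposures of the same line, one per illumination color,
    into a list of (R, G, B) pixels."""
    return [
        (quantize(r), quantize(g), quantize(b))
        for r, g, b in zip(red_line, green_line, blue_line)
    ]


# Two photodetector positions read under red, green and blue light in turn.
pixels = combine_rgb([1.0, 0.0], [1.0, 0.5], [1.0, 0.0])
```

 A linear sensor read this way yields one color line per three exposures, so the scanning object must stay effectively still across the three illuminations of a given line.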
 For a scanning object under cover glass 624 to be completely scanned, the scanning object and image sensor 630 have to move against each other. According to one embodiment, an optical encoder 636 is used to synchronize the motion of the scanning object with image sensor 630. In operation, a user may move a portable device integrated with a contact image sensor across one or more lines of text strings on a scanning object. A contact motion between the scanning object and the contact image sensor causes optical encoder 636 to rotate so as to record the motion (e.g., its speed). The recorded motion from optical encoder 636 is used to synchronize the generation of the image data. The detailed designs of the synchronization circuitry with an optical encoder are known to those skilled in the art and are not described herein to avoid obscuring aspects of the present invention. Alternatively, other motion mechanisms may be used to cause a relative motion between the scanner and a scanning object.
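 The synchronization idea can be sketched in software terms, although the disclosure leaves the actual circuitry to those skilled in the art. In the sketch below, a scan line is triggered each time the cumulative encoder count advances by a fixed number of ticks; the function name `lines_to_capture` and the tick values are hypothetical.

```python
def lines_to_capture(encoder_counts, counts_per_line):
    """Return the sample indices at which a new scan line should be read.

    encoder_counts is a cumulative tick count sampled over time. A line is
    triggered whenever the travel crosses another multiple of
    counts_per_line, so line spacing follows distance rather than time.
    """
    triggers = []
    next_threshold = counts_per_line
    for index, count in enumerate(encoder_counts):
        # A fast movement can cross several thresholds within one sample;
        # the same index is then recorded once per crossed threshold.
        while count >= next_threshold:
            triggers.append(index)
            next_threshold += counts_per_line
    return triggers


# The hand speeds up and slows down, yet lines stay evenly spaced in distance.
triggers = lines_to_capture([0, 2, 4, 9, 10, 12], counts_per_line=4)
```

 Tying line capture to travel distance is what keeps the scanned text from appearing stretched or squeezed when the user moves the device at an uneven speed.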
 Returning now to FIG. 6A, after POA 602 receives the text information from the captured image(s), the text information may be transported via a cable (e.g. RS232) to a personal computer 606 that may operate an application to store the text information for other practical uses. For example, the application on personal computer 606 may be Microsoft Outlook, which includes a contacts database. When the text information is from a business card, the text information may be parsed in POA 602 and appropriate contents are extracted and filled into related fields of a record. Hence POA 602 maintains the record for display when a user requests the record. The record can then be transported to the contacts database in personal computer 606 so that the contacts database is kept up to date.
 In another embodiment, POA 602 is able to transport recognized texts as well as captured graphic representations to a remote device via a wireless network 608. In this embodiment, POA 602 is equipped with a transceiver to communicate with wireless network 608. In addition, POA 602 may be equipped with various interfaces so that it can properly communicate with other computing devices 610. In one embodiment, POA 602 has an infrared (IR) port that can communicate with a printer or fax machine that is also equipped with an IR port. Through the infrared link, data (e.g. recognized texts or graphic images) can be transported to the printer or the fax machine. In essence, POA 602, as the name suggests, provides some of the basic office utilities.
FIG. 7 is a flow diagram of the process 700 associated with the rule based processing of the captured components in accordance with an embodiment of the present invention. At 704 a determination is made as to whether a component being processed is comprised of character strings (i.e., text). If a text component is detected, then it is processed using optical character recognition (OCR) at 712 and matched to a template filter at 716 where pre-defined string types (e.g., names, phone numbers and e-mail addresses) are identified for special processing, with the results of processing being displayed at 748. At 720 a determination is made as to whether the captured component is a fingerprint. If the image is a fingerprint, then a transform image (e.g., a Fourier Transform image) is generated at 728 and, if identity confirmation is required (732), the transformed image is forwarded to a certification authority (736) for processing and a confirmation message is generated at 740, with the results being displayed at 748. If the image is neither text nor a fingerprint image, then it is processed as a general image at 724, with the results being displayed as required at 748.
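 The branching of process 700 can be summarized by a short sketch that returns the sequence of block numbers visited for a given component. The function name `process_component` and the string labels for component kinds are hypothetical. The path taken when a fingerprint needs no identity confirmation is not spelled out in the description; the sketch assumes it proceeds directly to display at 748.

```python
def process_component(kind, needs_confirmation=False):
    """Return the ordered FIG. 7 block numbers visited for one component."""
    path = [704]                      # is the component text?
    if kind == "text":
        path += [712, 716]            # OCR, then template matching
    else:
        path.append(720)              # is the component a fingerprint?
        if kind == "fingerprint":
            path += [728, 732]        # transform, then check if confirmation is needed
            if needs_confirmation:
                path += [736, 740]    # certification authority, confirmation message
        else:
            path.append(724)          # general image processing
    path.append(748)                  # display results
    return path


text_path = process_component("text")
print_path = process_component("fingerprint", needs_confirmation=True)
image_path = process_component("logo")
```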
 The present invention has been described in sufficient detail with a certain degree of particularity. It is understood by those skilled in the art that the present disclosure of embodiments has been made by way of examples only and that numerous changes in the arrangement and combination of parts may be resorted to without departing from the spirit and scope of the invention as claimed. While the embodiments discussed herein may appear to include some limitations as to the presentation of the information units, in terms of the format and arrangement, the invention has applicability well beyond such embodiments, which can be appreciated by those skilled in the art. Accordingly, the scope of the present invention is defined by the appended claims rather than the foregoing description of embodiments.