IMAGE CLASSIFICATION AND
INFORMATION RETRIEVAL OVER
WIRELESS DIGITAL NETWORKS AND THE
CROSS REFERENCES TO RELATED
The present application claims priority to U.S. patent application Ser. No. 11/534,667, filed on Sep. 24, 2006, which claims priority to U.S. Provisional Patent Application No. 60/721,226, filed Sep. 28, 2005, now abandoned.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a method and system for classification of digital facial images over wireless digital networks or the Internet and retrieval of information associated with classified images.
2. Description of the Related Art
Classification of facial images using feature recognition software is currently used by various government agencies such as the Department of Homeland Security (DHS) and the Department of Motor Vehicles (DMV) for detecting terrorists, detecting suspected cases of identity fraud, automating border and passport control, and correcting mistakes in their respective facial image databases. Facial images stored in the DMV or DHS are digitized and stored in centralized databases, along with associated information on the person. Examples of companies that provide biometric facial recognition software include Cross Match Technologies, Cognitec, Cogent Systems, and Iridian Technologies; of these, Cognitec also provides a kiosk for digitally capturing images of people for storage into their software.
Your face is an important part of who you are and how people identify you. Imagine how hard it would be to recognize an individual if all faces looked the same. Except in the case of identical twins, the face is arguably a person's most unique physical characteristic. While humans have had the innate ability to recognize and distinguish different faces for millions of years, computers are just now catching up.
Visionics, a company based in New Jersey, is one of many developers of facial recognition technology. The twist to its particular software, FACEIT, is that it can pick someone's face out of a crowd, extract that face from the rest of the scene and compare it to a database full of stored images. In order for this software to work, it has to know what a basic face looks like. Facial recognition software is based on the ability to first recognize faces, which is a technological feat in itself, and then measure the various features of each face.
If you look in the mirror, you can see that your face has certain distinguishable landmarks. These are the peaks and valleys that make up the different facial features. Visionics defines these landmarks as nodal points. There are about 80 nodal points on a human face. A few of the nodal points measured by the FACEIT software are: distance between eyes; width of nose; depth of eye sockets; cheekbones; jaw line; and chin. These nodal points are measured to create a numerical code that represents the face in a database. This code is referred to as a faceprint, and only fourteen to twenty-two nodal points are necessary for the FACEIT software to complete the recognition process.
Facial recognition methods may vary, but they generally involve a series of steps that serve to capture, analyze and compare your face to a database of stored images. The basic process that is used by the FACEIT software to capture and compare images is set forth below and involves Detection, Alignment, Normalization, Representation, and Matching. To identify someone, facial recognition software compares newly captured images to databases of stored images to see if that person is in the database.
In Detection, when the system is attached to a video surveillance system, the recognition software searches the field of view of a video camera for faces. If there is a face in the view, it is detected within a fraction of a second. A multi-scale algorithm is used to search for faces in low resolution. The system switches to a high-resolution search only after a head-like shape is detected.
In Alignment, once a face is detected, the system determines the head's position, size and pose. A face needs to be turned at least thirty-five degrees toward the camera for the system to register the face.
Normalization is when the image of the head is scaled and rotated so that the head can be registered and mapped into an appropriate size and pose. Normalization is performed regardless of the head's location and distance from the camera. Light does not impact the normalization process.
Representation is when the system translates the facial data into a unique code. This coding process allows for easier comparison of the newly acquired facial data to stored facial data.
Matching is when the newly acquired facial data is compared to the stored data and linked to at least one stored facial representation.
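As an illustration of the Normalization step described above, the following minimal Python sketch computes the rotation and scale that would bring a detected head into a canonical size and pose, using two eye positions as reference points. The function names, the choice of eye centers as reference points, and the target eye distance of 60 pixels are assumptions made for illustration only; the text does not describe how the FACEIT software performs this step.

```python
import math

def normalization_params(left_eye, right_eye, target_eye_distance=60.0):
    """Return (angle_degrees, scale) that level the eye line and fix its length.

    left_eye and right_eye are (x, y) pixel coordinates. Rotating by
    -angle_degrees makes the eye line horizontal; multiplying by scale makes
    the eye distance equal to target_eye_distance (an assumed value).
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    distance = math.hypot(dx, dy)
    if distance == 0:
        raise ValueError("eye positions must differ")
    angle_degrees = math.degrees(math.atan2(dy, dx))
    scale = target_eye_distance / distance
    return angle_degrees, scale

def normalize_point(point, left_eye, right_eye, target_eye_distance=60.0):
    """Map a point into the normalized frame whose origin is the left eye."""
    angle_degrees, scale = normalization_params(
        left_eye, right_eye, target_eye_distance
    )
    theta = math.radians(-angle_degrees)
    # Translate so the left eye is the origin, then rotate and scale.
    x = point[0] - left_eye[0]
    y = point[1] - left_eye[1]
    x_rot = x * math.cos(theta) - y * math.sin(theta)
    y_rot = x * math.sin(theta) + y * math.cos(theta)
    return (x_rot * scale, y_rot * scale)
```

With this sketch, any landmark can be mapped into the same frame regardless of where the head sits in the picture or how far it is from the camera, which is the property the Normalization step calls for.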
The heart of the FACEIT facial recognition system is the Local Feature Analysis (LFA) algorithm. This is the mathematical technique the system uses to encode faces. The system maps the face and creates the faceprint. Once the system has stored a faceprint, it can compare it to the thousands or millions of faceprints stored in a database. Each faceprint is stored as an 84-byte file.
Early facial recognition technology taught an identification system in which major features (e.g. the shape of a person's nose in profile) are extracted from an image and stored. The stored features are subsequently retrieved and overlaid on a current image of the person to verify identity.
Other early facial recognition taught digitizing a scanned image into binary data which is then compressed and then a sequence of coordinates and vector values are generated which describe the skeletonized image. The coordinates and vector values allow for compact storage of the image and facilitate regeneration of the image.
Technologies provided by wireless carriers and cellular phone manufacturers enable the transmission of facial or object images between phones using Multimedia Messaging Services (MMS) as well as to the Internet over Email (Simple Mail Transfer Protocol, SMTP) and Wireless Application Protocol (WAP). Examples of digital wireless devices capable of capturing and receiving images and text are camera phones provided by Nokia, Motorola, LG, Ericsson, and others. Such phones are capable of handling images as JPEGs over MMS, Email, and WAP across many of the wireless carriers: Cingular and T-Mobile (GSM/GPRS), Verizon (CDMA), and others.
Neven, U.S. Patent Publication 2005/0185060, for an Image Base Inquiry system For Search Engines For Mobile Telephones With Integrated Camera, discloses a system using a mobile telephone digital camera to send an image to a server that converts the image into symbolic information, such as plain text, and furnishes the user links associated with the image which are provided by search engines.
Yanagisawa, et al., U.S. Patent Publication Number 2005/0076004, is directed at generating a database for an electronic picture book database to provide information on photographed plants and flowers.
Kanarat, U.S. Patent Publication Number 2003/0130035 teaches a stand-alone amusement device which matches images of different humans. Kanarat teaches matching facial images based on attractiveness.
Kim, et al., U.S. Pat. No. 7,298,931, is based on matching a query image with database images using an iterative process.
Meyer, U.S. Patent Publication Number 2005/0043897, is based on using photos submitted over the Internet or by mail to match an image to a database of images to provide a user a number of images to select a matching image.
The general public has a fascination with celebrities and many members of the general public use celebrities as a standard for judging some aspect of their life. Many psychiatrists and psychologists believe the confluence of forces coming together in technology and media have led to this celebrity worship factor in our society. One output of this celebrity factor has been a universal approach to compare or determine that someone looks like a certain celebrity. People are constantly stating that someone they meet or know looks like a celebrity, whether it is true or not. What would be helpful would be to scientifically provide a basis for someone to lay claim as looking like a certain celebrity.
BRIEF SUMMARY OF THE INVENTION
The present invention provides a novel method and system for providing the general public an expedient, inexpensive and technologically easy means for determining which celebrity someone looks like.
The invention classifies a person, or whom a person most looks like, by preferably using a digital image captured by a wireless communication device (preferably a mobile telephone) or from a personal computer (PC). The image may be in a JPEG, TIFF, GIF or other standard image format. Further, an analog image may be utilized if digitized. An example is determining which celebrity most resembles the image that was sent to the application; the result can be viewed by the user either through their wireless communication device or through a website. The image is sent to the wireless carrier and subsequently sent over the internet to an image classification server. Alternatively, the digital image may be uploaded to a PC from a digital camera or scanner and then sent to the image classification server over the internet.
After an image is received by the image classification server, the image is processed into a feature vector, which reduces the complexity of the digital image data into a small set of variables that represent the features of the image that are of interest for classification purposes.
The feature vector is compared against existing feature vectors in an image database to find the closest match. The image database preferably contains one or more feature vectors for each target individual.
Once classified, an image of the best matching person, possibly manipulated to emphasize matching characteristics, as well as meta-data associated with the person, sponsored information, similar product, inventory or advertisement is sent back to the user's PC or wireless communication device.
A more detailed explanation of a preferred method of the invention is as follows below. The user captures a digital image with a digital camera enabled wireless communication device, such as a mobile telephone. The compressed digital image is sent to the wireless carrier as a multimedia message (MMS), a short message service ("SMS"), an e-mail (Simple Mail Transfer Protocol ("SMTP")), or wireless application protocol ("WAP") upload. The image is subsequently sent over the internet using HTTP or e-mail to an image classification server. Alternatively, the digital image may be uploaded to a PC from a digital camera, or scanner. Once on the PC, the image can be transferred over the internet to the image classification server as an e-mail attachment, or HTTP upload. The user is the provider of the digital image for classification, and includes, but is not limited to a physical person, machine, or software application.
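The HTTP upload path described above can be sketched as follows. This is a minimal illustration, assuming the image classification server accepts a standard multipart/form-data POST; the endpoint URL, the form field name "image", and the function name are hypothetical and do not come from the source. The request is built but not sent.

```python
import uuid
import urllib.request

def build_upload_request(image_bytes, filename, url):
    """Build (but do not send) a multipart/form-data POST carrying one image."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        # The field name "image" is an assumption for this sketch.
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        "Content-Type: image/jpeg\r\n"
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    body = head + image_bytes + tail
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header(
        "Content-Type", f"multipart/form-data; boundary={boundary}"
    )
    request.add_header("Content-Length", str(len(body)))
    return request

# Hypothetical endpoint; the source does not give the server's address.
request = build_upload_request(
    b"\xff\xd8\xff\xe0 fake JPEG bytes",
    "face.jpg",
    "http://classifier.example/upload",
)
# urllib.request.urlopen(request) would transmit it; omitted in this sketch.
```

The same image bytes could instead be carried as an e-mail attachment, as the text notes; only the transport wrapper would differ.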
After the image is received by the image classification server, a feature vector is generated for the image. A feature vector is a small set of variables that represent the features of the image that are of interest for classification purposes. Creation and comparison of feature vectors may be queued, and scaled across multiple machines. Alternatively, different feature vectors may be generated for the same image. Alternatively, the feature vectors of several images of the same individual may be combined into a single feature vector. The incoming image, as well as associated feature vectors, may be stored for later processing, or added to the image database. For faces, possible feature vector variables are the distance between the eyes, the distance from the center of the eyes to the chin, the size and shape of the eyebrows, the hair color, eye color, facial hair if any, and the like.
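To make the idea of a feature vector concrete, the following minimal Python sketch builds a two-element vector from two of the variables named above (the distance between the eyes and the distance from the center of the eyes to the chin) and combines the vectors of several images of one individual. The landmark inputs and function names are hypothetical, and element-wise averaging is an assumed combination rule; the text does not specify how feature vectors are combined.

```python
import math

def face_feature_vector(left_eye, right_eye, chin):
    """Build a tiny feature vector from three hypothetical landmark positions.

    Returns [eye_distance, eye_center_to_chin_distance] in pixels.
    """
    eye_distance = math.dist(left_eye, right_eye)
    eye_center = (
        (left_eye[0] + right_eye[0]) / 2,
        (left_eye[1] + right_eye[1]) / 2,
    )
    return [eye_distance, math.dist(eye_center, chin)]

def combine_feature_vectors(vectors):
    """Combine vectors from several images of one individual by averaging.

    Averaging is an assumption made for this sketch.
    """
    if not vectors:
        raise ValueError("need at least one feature vector")
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("feature vectors must have equal length")
    # zip(*vectors) groups the i-th element of every vector together.
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```

A practical vector would hold many more variables (eyebrow shape, hair color, and so on) and would normally be computed after the face has been scaled to a common size, so that pixel distances are comparable between images.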
After the feature vector for an image is created, the feature vector is compared against feature vectors in an image database to find the closest match. Preferably, each image in the image database has a feature vector. Alternatively, feature vectors for the image database are created from a set of faces, typically eight or more digital images at slightly different angles for each individual. Since the target individual's feature vector may be generated from several images, an optional second pass is made to find which of the individual images that were used to create the feature vector for the object best matches the incoming image.
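The two-pass comparison described above can be sketched as follows. This is an illustration under stated assumptions: Euclidean distance is used as the closeness measure (the text says only "closest match"), and the database layout, names, and sample values are hypothetical.

```python
import math

def closest(query, candidates):
    """Return the key whose vector has the smallest Euclidean distance to query."""
    return min(candidates, key=lambda key: math.dist(query, candidates[key]))

def two_pass_match(query, individuals):
    """Find the best individual, then that individual's best source image.

    individuals maps a name to
    {"combined": vector, "images": {image_id: vector}}.
    """
    combined = {name: record["combined"] for name, record in individuals.items()}
    best_name = closest(query, combined)  # first pass: per-individual vectors
    best_image = closest(query, individuals[best_name]["images"])  # second pass
    return best_name, best_image

# Hypothetical database with made-up two-element feature vectors.
individuals = {
    "person_a": {
        "combined": [6.0, 4.0],
        "images": {"a1.jpg": [5.0, 4.0], "a2.jpg": [7.0, 4.0]},
    },
    "person_b": {
        "combined": [9.0, 8.0],
        "images": {"b1.jpg": [9.0, 8.0]},
    },
}
result = two_pass_match([6.8, 4.1], individuals)
```

Restricting the second pass to the winning individual's images keeps the per-query cost close to one comparison per individual, rather than one per stored image.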
Once classified, the matching image's name and associated meta-data is retrieved from the database. Before the response is sent, the best-matching image or incoming image may be further manipulated to emphasize the similarities between the two images. This image manipulation can be automated, or can be done interactively by the user. The matching image's name, meta-data, associated image, and a copy of the incoming image are then sent back to the user's wireless communication device or PC, and also to a web page for the user.
One aspect of the present invention is a method for matching an unknown image of an individual with a known image of another individual. The method includes acquiring an unknown digital facial image of an individual human. The method also includes wirelessly transmitting the unknown digital facial image from a mobile communication device of a sender over a wireless network to a server. The method also includes analyzing the unknown digital facial image at the server to determine if a plurality of facial image factors are acceptable. The plurality of facial image factors comprises the lack of a facial image, the lack of eyes, uneven lighting, the brightness of the facial image, pose angle of the facial image, relative size of the facial image and pixel strength of the facial image. The method also includes processing the unknown digital facial image to create a processed image having a primary feature vector. The method also includes