Publication number: USRE36041 E
Publication type: Grant
Application number: US 08/340,615
Publication date: Jan 12, 1999
Filing date: Nov 16, 1994
Priority date: Nov 1, 1990
Fee status: Paid
Also published as: DE69130616D1, DE69130616T2, EP0555380A1, EP0555380A4, EP0555380B1, US5164992, WO1992008202A1
Inventors: Matthew Turk, Alex P. Pentland
Original Assignee: Massachusetts Institute of Technology
Face recognition system
US RE36041 E
Abstract
A recognition system for identifying members of an audience, the system including an imaging system which generates an image of the audience; a selector module for selecting a portion of the generated image; a detection means which analyzes the selected image portion to determine whether an image of a person is present; and a recognition module responsive to the detection means for determining whether a detected image of a person identified by the detection means resembles one of a reference set of images of individuals.
Claims (37)
What is claimed is:
1. A recognition system for identifying members of an audience, the system comprising:
an imaging system which generates an image of the audience;
a selector module for selecting a portion of said generated image;
means for representing a reference set of images of individuals as a set of eigenvectors in a multi-dimensional image space;
a detection means which determines whether the selected image portion contains an image that can be classified as an image of a person, said detection means including means for representing said selected image portion as an input vector in said multi-dimensional image space and means for computing the distance between a point identified by said input vector and a multi-dimensional subspace defined by said set of eigenvectors, wherein said detection means uses the computed distance to determine whether the selected image portion contains an image that can be classified as an image of a person; and
a recognition module responsive to said detection means for determining whether a detected image of a person identified by said detection means resembles one of the reference set of images of individuals.
2. The recognition system of claim 1 wherein said detection means further comprises a thresholding means for determining whether an image of a person is present by comparing said computed distance to a preselected threshold.
3. The recognition system of claim 1 wherein said selector module comprises a motion detector for identifying the selected portion of said image by detecting motion.
4. The recognition system of claim 3 wherein said selector module further comprises a locator module for locating the portion of said image corresponding to a face of the person based on motion detected by said motion detector.
5. The recognition system of claim 1 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.
6. The recognition system of claim 1 wherein said recognition module comprises means for representing each member of said reference set as a corresponding point in said subspace.
7. The recognition system of claim 6 wherein the location of each point in subspace associated with a corresponding member of said reference set is determined by projecting a vector associated with that member onto said subspace.
8. The recognition system of claim 7 wherein said recognition module further comprises means for projecting said input vector onto said subspace.
9. The recognition system of claim 8 wherein said recognition module further comprises means for selecting a particular member of said reference set and means for computing a distance within said subspace between a point identified by the projection of said input vector onto said subspace and the point in said subspace associated with said selected member.
10. The recognition system of claim 8 wherein said recognition module further comprises means for determining for each member of said reference set a distance in subspace between the location associated with that member in subspace and the point identified by the projection of said input vector onto said subspace.
11. The recognition system of claim 10 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.
12. A method for identifying members of an audience, the method comprising:
generating an image of the audience;
selecting a portion of said generated image;
representing a reference set of images of individuals as a set of eigenvectors in a multi-dimensional image space;
representing said selected image portion as an input vector in said multi-dimensional image space;
computing the distance between a point identified by said input vector and a multi-dimensional subspace defined by said set of eigenvectors;
using the computed distance to determine whether the selected image portion contains an image that can be classified as an image of a person; and
if it is determined that the selected image contains an image that can be classified as an image of a person, determining whether said image of a person resembles one of a reference set of images of individuals.
13. The method of claim 12 further comprising the step of determining which one, if any, of the members of said reference set said image of a person resembles.
14. The method of claim 12 wherein the image of the audience is a sequence of image frames and wherein the method further comprises detecting motion within the sequence of image frames and wherein the selected image portion is determined on the basis of the detected motion.
15. The method of claim 12 wherein the step of determining whether the selected image portion contains an image that can be classified as an image of a person further comprises comparing said computed distance to a preselected threshold.
16. The method of claim 15 wherein the step of determining whether said image of a person resembles a member of said reference set comprises representing each member of said reference set as a corresponding point in said subspace.
17. The method of claim 16 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises determining the location of each point in subspace associated with a corresponding member of said reference set by projecting a vector associated with that member onto said subspace.
18. The method of claim 17 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises projecting said input vector onto said subspace.
19. The method of claim 18 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises selecting a member of said reference set and computing a distance within said subspace between a point identified by the projection of said input vector onto said subspace and the point in said subspace associated with said selected member.
20. The method of claim 18 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises determining for each member of said reference set a distance in subspace between the location for that member in subspace and the point identified by the projection of said input vector onto said subspace.
21. The method of claim 20 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.
22. A recognition system comprising:
an imaging system which generates an image;
a selector module for selecting a portion of said generated image;
means for representing a reference set of images of individuals as a set of eigenvectors in a multi-dimensional image space;
a detection means which determines whether the selected image portion contains an image that can be classified as an image of a person, said detection means including means for representing said selected image portion as an input vector in said multi-dimensional image space and means for computing the distance between a point identified by said input vector and a multi-dimensional subspace defined by said set of eigenvectors, wherein said detection means uses the computed distance to determine whether the selected image portion contains an image that can be classified as an image of a person; and
a recognition module responsive to said detection means for determining whether a detected image of a person identified by said detection means resembles one of the reference set of images of individuals.
23. The recognition system of claim 22 wherein said detection means further comprises a thresholding means for determining whether an image of a person is present by comparing said computed distance to a preselected threshold.
24. The recognition system of claim 22 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.
25. The recognition system of claim 22 wherein said recognition module comprises means for representing each member of said reference set as a corresponding point in said subspace.
26. The recognition system of claim 25 wherein the location of each point in subspace associated with a corresponding member of said reference set is determined by projecting a vector associated with that member onto said subspace.
27. The recognition system of claim 26 wherein said recognition module further comprises means for projecting said input vector onto said subspace.
28. The recognition system of claim 27 wherein said recognition module further comprises means for selecting a particular member of said reference set and means for computing a distance within said subspace between a point identified by the projection of said input vector onto said subspace and the point in said subspace associated with said selected member.
29. The recognition system of claim 27 wherein said recognition module further comprises means for determining for each member of said reference set a distance in subspace between the location associated with that member in subspace and the point identified by the projection of said input vector onto said subspace.
30. The recognition system of claim 24 wherein said means for representing said reference set includes means for adding a member to said reference set by projecting into said subspace an input vector having a computed distance indicative of an image of a face.
31. A method comprising:
generating an image;
selecting a portion of said generated image;
representing a reference set of images of faces of individuals as a set of eigenvectors in a multi-dimensional image space;
representing said selected image portion as an input vector in said multi-dimensional image space;
computing the distance between a point identified by said input vector and a multi-dimensional subspace defined by said set of eigenvectors;
using the computed distance to determine whether the selected image portion contains an image that can be classified as an image of a person's face; and
if it is determined that the selected image contains an image that can be classified as an image of a person's face, determining whether said image of a person's face resembles one of a reference set of images of faces of individuals.
32. The method of claim 31 further comprising the step of determining which one, if any, of the members of said reference set said image of a person's face resembles.
33. The method of claim 31 wherein the step of determining whether the selected image portion contains an image that can be classified as an image of a person's face further comprises comparing said computed distance to a preselected threshold.
34. The method of claim 33 wherein the step of determining whether said image of a person's face resembles a member of said reference set comprises representing each member of said reference set as a corresponding point in said subspace.
35. The method of claim 34 wherein the step of determining whether said image of a person's face resembles a member of said reference set further comprises determining the location of each point in subspace associated with a corresponding member of said reference set by projecting a vector associated with that member onto said subspace.
36. The method of claim 35 wherein the step of determining whether said image of a person's face resembles a member of said reference set further comprises projecting said input vector onto said subspace.
37. The method of claim 36 wherein the step of determining whether said image of a person's face resembles a member of said reference set further comprises determining for each member of said reference set a distance in subspace between the location for that member in subspace and the point identified by the projection of said input vector onto said subspace.
Description
BACKGROUND OF THE INVENTION

The invention relates to a system for identifying members of a viewing audience.

For a commercial television network, the cost of its advertising time depends critically on the popularity of its programs among the television viewing audience. Popularity, in this case, is typically measured in terms of the program's share of the total audience viewing television at the time the program airs. As a general rule of thumb, advertisers prefer to place their advertisements where they will reach the greatest number of people. Thus, there is higher demand among commercial advertisers for advertising time slots alongside more popular programs, and such time slots command a higher price.

Because the economics of television advertising depends so critically on the tastes and preferences of the television audience, the television industry invests a substantial amount of time, effort and money in measuring those tastes and preferences. One preferred approach involves monitoring the actual viewing habits of a group of volunteer families which represent a cross-section of all people who watch television. Typically, the participants in such a study allow monitoring equipment to be placed in their homes. Whenever a participant watches a television program, the monitoring equipment records the time, the identity of the program and the identity of the members of the viewing audience. Many of these systems require active participation by the television viewer to obtain the monitoring information. That is, the viewer must in some way interact with the equipment to record his presence in the viewing audience. If the viewer forgets to record his presence, the monitoring statistics will be incomplete. In general, the less manual intervention required by the television viewer, the more likely it is that the gathered statistics on viewing habits will be complete and error free.

Systems have been developed which automatically identify members of the viewing audience without requiring the viewer to enter any information. For example, U.S. Pat. No. 4,858,000 to Daozheng Lu, issued Aug. 15, 1989, describes such a system. In the system, a scanner using infrared detectors locates a member of the viewing audience, captures an image of the located member, extracts a pattern signature for the captured image and then compares the extracted pattern signature to a set of stored pattern image signatures to identify the audience member.

SUMMARY OF THE INVENTION

In general, in one aspect, the invention is a recognition system for identifying members of an audience. The invention includes an imaging system which generates an image of the audience; a selector module for selecting a portion of the generated image; a detection means which analyzes the selected image portion to determine whether an image of a person is present; and a recognition module for determining whether a detected image of a person resembles one of a reference set of images of individuals.

Preferred embodiments include the following features. The recognition module also determines which one, if any, of the individuals in the reference set the detected image resembles. The selection means includes a motion detector for identifying the selected portion of the image by detecting motion and it includes a locator module for locating the portion of the image corresponding to the face of the person detected. In the recognition system, the detection means and the recognition module employ first and second pattern recognition techniques, respectively, to determine whether an image of a person is present in the selected portion of the image, and both pattern recognition techniques employ a set of eigenvectors in a multi-dimensional image space to characterize the reference set. In addition, the second pattern recognition technique also represents each member of the reference set as a point in a subspace defined by the set of eigenvectors. Also, the image of a person is an image of a person's face and the reference set includes images of faces of the individuals.

Also in preferred embodiments, the recognition system includes means for representing the reference set as a set of eigenvectors in a multi-dimensional image space and the detection means includes means for representing the selected image portion as an input vector in the multi-dimensional image space and means for computing the distance between a point identified by the input vector and a subspace defined by the set of eigenvectors. The detection means also includes a thresholding means for determining whether an image of a person is present by comparing the computed distance to a preselected threshold. The recognition module includes means for representing each member of the reference set as a corresponding point in the subspace. To determine the location of each point in subspace associated with a corresponding member of the reference set, a vector associated with that member is projected onto the subspace.

The recognition module also includes means for projecting the input vector onto the subspace, means for selecting a particular member of the reference set, and means for computing a distance within the subspace between a point identified by the projection of the input vector onto the subspace and the point in the subspace associated with the selected member.

In general, in another aspect, the invention is a method for identifying members of an audience. The invention includes the steps of generating an image of the audience; selecting a portion of the generated image; analyzing the selected image portion to determine whether an image of a person is present; and if an image of a person is determined to be present, determining whether the image of a person resembles one of a reference set of images of individuals.

One advantage of the invention is that it is fast, relatively simple and works well in a constrained environment, i.e., an environment for which the associated image remains relatively constant except for the coming and going of people. In addition, the invention determines whether a selected portion of an image actually contains an image of a face. If it is determined that the selected image portion contains an image of a face, the invention then determines which one of a reference set of known faces the detected face image most resembles. If the detected face image is not present among the reference set, the invention reports the presence of an unknown person in the audience. The invention has the ability to discriminate face images from images of other objects.

Other advantages and features will become apparent from the following description of the preferred embodiment and from the claims.

DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 is a block diagram of a face recognition system;

FIG. 2 is a flow diagram of an initialization procedure for the face recognition module;

FIG. 3 is a flow diagram of the operation of the face recognition module; and

FIG. 4 is a block diagram of a motion detection system for locating faces within a sequence of images.

STRUCTURE AND OPERATION

Referring to FIG. 1, in an audience monitoring system 2, a video camera 4, which is trained on an area where members of a viewing audience generally sit to watch the TV, sends a sequence of video image frames to a motion detection module 6. Video camera 4, which may, for example, be installed in the home of a family that has volunteered to participate in a study of public viewing habits, generates images of the TV viewing audience. Motion detection module 6 processes the sequence of image frames to identify regions of the recorded scene that contain motion, and thus may be evidence of the presence of a person watching TV. In general, motion detection module 6 accomplishes this by comparing successive frames of the image sequence so as to find those locations containing image data that changes over time. Since the image background (i.e., images of the furniture and other objects in the room) will usually remain unchanged from frame to frame, the areas of movement will generally be evidence of the presence of a person in the viewing audience.

When movement is identified, a head locator module 8 selects a block of the image frame containing the movement and sends it to a face recognition module 10 where it is analyzed for the presence of recognizable faces. Face recognition module 10 performs two functions. First, it determines whether the image data within the selected block resembles a face. Then, if it does resemble a face, module 10 determines whether the face is one of a reference set of faces. The reference set may include, for example, the images of faces of all members of the family in whose house the audience monitoring system has been installed.

To perform its recognition functions, face recognizer 10 employs a multi-dimensional representation in which face images are characterized by a set of eigenvectors or "eigenfaces". In general, according to this technique, each image is represented as a vector (or a point) in a very high-dimensional image space in which each pixel of the image is represented by a corresponding dimension or axis. The dimension of this image space thus depends upon the size of the image being represented and can become very large for any reasonably sized image. For example, if the block of image data is N pixels by N pixels, then the multi-dimensional image space has dimension N². The image vector which represents the N by N block of image data in this multi-dimensional image space is constructed by simply concatenating the rows of the image data to generate a vector of length N².
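As a concrete illustration (a hypothetical sketch, not taken from the patent), the row-concatenation is a single reshape in NumPy:

```python
import numpy as np

N = 256
image = np.zeros((N, N), dtype=np.uint8)   # an N by N grayscale image I(x,y)

# Concatenating the rows yields a vector of length N^2, i.e. a single
# point in the N^2-dimensional image space.
gamma = image.astype(np.float64).reshape(N * N)
assert gamma.shape == (65536,)
```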

Face images, like all other possible images, are represented by points within this multi-dimensional image space. The distribution of faces, however, tends to be grouped within a region of the image space. Thus, the distribution of faces of the reference set can be characterized by using principal component analysis. The resulting principal components of the distribution of faces, or the eigenvectors of the covariance matrix of the set of face images, define the variation among the set of face images. These eigenvectors are typically ordered, each one accounting for a different amount of variation among the face images. They can be thought of as a set of features which together characterize the variation between face images within the reference set. Each face image location within the multi-dimensional image space contributes more or less to each eigenvector, so that each eigenvector represents a sort of ghostly face which is referred to herein as an eigenface.

Each individual face from the reference set can be represented exactly in terms of a linear combination of M non-zero eigenfaces. Each face can also be approximated using only the M' "best" eigenfaces, i.e., those that have the largest eigenvalues and which therefore account for the most variance within the set of face images. The best M' eigenfaces span an M'-dimensional subspace (referred to hereinafter as "face space") of all possible images.

This approach to face recognition involves the initialization operations shown in FIG. 2 to "train" recognition module 10. First, a reference set of face images is obtained and each of the faces of that set is represented as a corresponding vector or point in the multi-dimensional image space (step 100). Then, using principal component analysis, the distribution of points for the reference set of faces is characterized in terms of a set of eigenvectors (or eigenfaces) (step 102). If a full characterization of the distribution of points is performed, it will yield N² eigenfaces of which M are non-zero. Of these, only the M' eigenfaces corresponding to the highest eigenvalues are chosen, where M' < M << N². This subset of eigenfaces is used to define a subspace (or face space) within the multi-dimensional image space. Finally, each member of the reference set is represented by a corresponding point within face space (step 104). For a given face, this is accomplished by projecting its point in the higher dimensional image space onto face space.
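The initialization of FIG. 2 can be sketched in a few lines of NumPy. This is an illustration only, not the patent's implementation; the function name, the argument layout, and the assumption that the reference faces arrive as flattened rows of an array are all ours:

```python
import numpy as np

def train_eigenfaces(faces, m_prime):
    """Steps 100-104 of FIG. 2.

    faces   -- array of shape (M, N*N), one flattened reference face per row
    m_prime -- number of eigenfaces M' to keep (M' < M)
    Returns the average face, the M' eigenfaces (as columns), and the
    face-space coordinates of each reference face.
    """
    psi = faces.mean(axis=0)            # average face Psi (Eq. 1)
    A = (faces - psi).T                 # columns are the Phi_i = Gamma_i - Psi

    # Solve the small M by M problem L = A^T A instead of diagonalizing the
    # huge N^2 by N^2 covariance matrix C = A A^T (Eqs. 5-7).
    eigvals, V = np.linalg.eigh(A.T @ A)
    top = np.argsort(eigvals)[::-1][:m_prime]
    U = A @ V[:, top]                   # eigenfaces u_l = sum_k v_lk Phi_k (Eq. 7)
    U /= np.linalg.norm(U, axis=0)      # keep the eigenfaces orthonormal

    # Step 104: represent each reference face as a point in face space.
    ref_omegas = (U.T @ A).T            # one M'-vector of weights per face
    return psi, U, ref_omegas
```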

If additional faces are added to the reference set at a later time, these operations are repeated to update the set of eigenfaces characterizing the reference set.

After face recognition module 10 is initialized, it implements the steps shown in FIG. 3 to recognize face images supplied by face locator module 8. First, face recognition module 10 projects the input image (i.e., the image presumed to contain a face) onto face space by projecting it onto each of the M' eigenfaces (step 200). Then, module 10 determines whether the input image is a face at all (whether known or unknown) by checking to see if the image is sufficiently close to "face space" (step 202). That is, module 10 computes how far the input image in the multi-dimensional image space is from the face space and compares this to a preselected threshold. If the computed distance is greater than the preselected threshold, module 10 indicates that it does not represent a face image and motion detection module 6 locates the next block of the overall image which may contain a face image.

If the computed distance is sufficiently close to face space (i.e., less than the preselected threshold), recognition module 10 treats it as a face image and proceeds with determining whose face it is (step 206). This involves computing distances between the projection of the input image onto face space and each of the reference face images in face space. If the projected input image is sufficiently close to any one of the reference faces (i.e., the computed distance in face space is less than a predetermined distance), recognition module 10 identifies the input image as belonging to the individual associated with that reference face. If the projected input image is not sufficiently close to any one of the reference faces, recognition module 10 reports that a person has been located but the identity of the person is unknown.
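Continuing the sketch above (again illustrative; theta_t and theta_eps correspond to the thresholds θ_t and θ_ε introduced in the summary later in this description), the two tests of FIG. 3 might read:

```python
import numpy as np

def recognize(image_vec, psi, U, ref_omegas, theta_t, theta_eps):
    """Steps 200-206 of FIG. 3 for one candidate subimage.

    Returns the index of the matching reference face, "unknown" if the
    image is a face that matches no reference class, or None if the
    image is not a face at all.
    """
    phi = image_vec - psi                  # mean-adjusted input
    omega = U.T @ phi                      # project onto face space (Eq. 8)
    phi_f = U @ omega                      # reconstruction from face space

    # Step 202: squared distance from face space (Eq. 10).
    if np.sum((phi - phi_f) ** 2) > theta_t:
        return None                        # not a face image

    # Step 206: squared distance to each reference face in face space
    # (Eq. 9; here each reference image is treated as its own class).
    dists = np.sum((ref_omegas - omega) ** 2, axis=1)
    k = int(np.argmin(dists))
    return k if dists[k] <= theta_eps else "unknown"
```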

The mathematics underlying each of these steps will now be described in greater detail.

Calculating Eigenfaces

Let a face image I(x,y) be a two-dimensional N by N array of (8-bit) intensity values. The face image is represented in the multi-dimensional image space as a vector of dimension N². Thus, a typical image of size 256 by 256 becomes a vector of dimension 65,536, or, equivalently, a point in 65,536-dimensional image space. An ensemble of images, then, maps to a collection of points in this huge space.

Images of faces, being similar in overall configuration, are not randomly distributed in this huge image space and thus can be described by a relatively low dimensional subspace. Using principal component analysis, one identifies the vectors which best account for the distribution of face images within the entire image space. These vectors, namely, the "eigenfaces", define the "face space". Each vector is of length N², describes an N by N image, and is a linear combination of the original face images of the reference set.

Let the training set of face images be Γ_1, Γ_2, Γ_3, ..., Γ_M. The average face of the set is defined by

Ψ = (1/M) Σ_{n=1}^{M} Γ_n.    (1)

Each face differs from the average by the vector Φ_i = Γ_i − Ψ. This set of very large vectors is then subject to principal component analysis, which seeks a set of M orthonormal vectors u_n which best describes the distribution of the data. The kth vector, u_k, is chosen such that

λ_k = (1/M) Σ_{n=1}^{M} (u_k^T Φ_n)²    (2)

is a maximum, subject to the orthonormality constraint

u_l^T u_k = δ_lk, where δ_lk = 1 if l = k and 0 otherwise.    (3)

The vectors u_k and scalars λ_k are the eigenvectors and eigenvalues, respectively, of the covariance matrix

C = (1/M) Σ_{n=1}^{M} Φ_n Φ_n^T = A A^T,    (4)

where the matrix A = [Φ_1 Φ_2 ... Φ_M]. The matrix C, however, is N² by N², and determining the N² eigenvectors and eigenvalues can become an intractable task for typical image sizes.

If the number of data points in the face space is less than the dimension of the overall image space (namely, if M < N²), there will be only M − 1, rather than N², meaningful eigenvectors. (The remaining eigenvectors will have associated eigenvalues of zero.) One can solve for the N²-dimensional eigenvectors in this case by first solving for the eigenvectors of an M by M matrix, e.g., solving a 16 by 16 matrix rather than a 16,384 by 16,384 matrix, and then taking appropriate linear combinations of the face images Φ_i. Consider the eigenvectors v_i of A^T A such that

A^T A v_i = μ_i v_i.    (5)

Premultiplying both sides by A yields

A A^T A v_i = μ_i A v_i,    (6)

from which it is apparent that the A v_i are the eigenvectors of C = A A^T.

Following this analysis, it is possible to construct the M by M matrix L = A^T A, where L_mn = Φ_m^T Φ_n, and find the M eigenvectors v_l of L. These vectors determine linear combinations of the M training set face images to form the eigenfaces u_l:

u_l = Σ_{k=1}^{M} v_lk Φ_k,  l = 1, ..., M.    (7)

With this analysis the calculations are greatly reduced, from the order of the number of pixels in the images (N²) to the order of the number of images in the training set (M). In practice, the training set of face images will be relatively small (M << N²), and the calculations become quite manageable. The associated eigenvalues provide a basis for ranking the eigenvectors according to their usefulness in characterizing the variation among the images.
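A small numerical check of this reduction (illustrative only, with N = 128 and M = 16 as in the example above): the eigenvectors of the 16 by 16 matrix A^T A map to eigenvectors of the 16,384 by 16,384 matrix C = A A^T via u = Av, so C never has to be formed:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((128 * 128, 16))   # columns stand in for the Phi_i

mu, V = np.linalg.eigh(A.T @ A)            # 16 by 16 eigenproblem: cheap
u = A @ V[:, -1]                           # candidate eigenvector of A A^T

# Verify C u = mu u without ever materializing C.
assert np.allclose(A @ (A.T @ u), mu[-1] * u)
```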

In practice, a smaller M' is sufficient for identification, since accurate construction of the image is not a requirement. In this framework, identification becomes a pattern recognition task. The eigenfaces span an M'-dimensional subspace of the original N² image space. The M' significant eigenvectors of the L matrix are chosen as those with the largest associated eigenvalues. In test cases based upon M=16 face images, M'=7 eigenfaces were found to yield acceptable results, i.e., a level of accuracy sufficient for monitoring a TV audience for purposes of studying viewing habits and tastes.

A new face image (Γ) is transformed into its eigenface components (i.e., projected into "face space") by a simple operation,

ω_k = u_k^T (Γ − Ψ),    (8)

for k = 1, ..., M'. This describes a set of point-by-point image multiplications and summations, operations which may be performed at approximately frame rate on current image processing hardware.

The weights form a vector Ω^T = [ω_1 ω_2 ... ω_M'] that describes the contribution of each eigenface in representing the input face image, treating the eigenfaces as a basis set for face images. The vector may then be used in a standard pattern recognition algorithm to find which of a number of pre-defined face classes, if any, best describes the face. The simplest method for determining which face class provides the best description of an input face image is to find the face class k that minimizes the Euclidean distance

ε_k² = ‖Ω − Ω_k‖²,    (9)

where Ω_k is a vector describing the kth face class. The face classes Ω_k are calculated by averaging the results of the eigenface representation over a small number of face images (as few as one) of each individual. A face is classified as belonging to class k when the minimum ε_k is below some chosen threshold θ_ε. Otherwise the face is classified as "unknown", and optionally used to create a new face class.

Because creating the vector of weights is equivalent to projecting the original face image onto the low-dimensional face space, many images (most of them looking nothing like a face) will project onto a given pattern vector. This is not a problem for the system, however, since the distance ε between the image and face space is simply the squared distance between the mean-adjusted input image Φ = Γ − Ψ and its projection onto face space, Φ_f = Σ_{k=1}^{M'} ω_k u_k:

ε² = ‖Φ − Φ_f‖².    (10)

Thus, there are four possibilities for an input image and its pattern vector: (1) near face space and near a face class; (2) near face space but not near a known face class; (3) distant from face space and near a face class; and (4) distant from face space and not near a known face class.

In the first case, an individual is recognized and identified. In the second case, an unknown individual is present. The last two cases indicate that the image is not a face image. Case three typically shows up as a false positive in most other recognition systems. In the described embodiment, however, the false recognition may be detected because of the significant distance between the image and the subspace of expected face images.
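Schematically, these four cases reduce to two threshold tests (a sketch; the threshold names θ_t and θ_ε follow the summary below):

```python
def classify(eps_face_space, eps_nearest_class, theta_t, theta_eps):
    """Map the two computed distances onto the four cases above."""
    if eps_face_space <= theta_t:          # near face space
        if eps_nearest_class <= theta_eps:
            return "known individual"      # case 1
        return "unknown individual"        # case 2
    return "not a face"                    # cases 3 and 4
```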

Summary of Eigenface Recognition Procedure

To summarize, the eigenfaces approach to face recognition involves the following steps:

1. Collect a set of characteristic face images of the known individuals. This set may include a number of images for each person, with some variation in expression and in lighting. (Say four images of ten people, so M=40.)

2. Calculate the (40 by 40) matrix L, find its eigenvectors and eigenvalues, and choose the M' eigenvectors with the highest associated eigenvalues. (Let M'=10 in this example.)

3. Combine the normalized training set of images according to Eq. 7 to produce the (M'=10) eigenfaces u_k.

4. For each known individual, calculate the class vector Ω_k by averaging the eigenface pattern vectors Ω (from Eq. 9) calculated from the original (four) images of the individual. Choose a threshold θ_ε which defines the maximum allowable distance from any face class, and a threshold θ_t which defines the maximum allowable distance from face space (according to Eq. 10).

5. For each new face image to be identified, calculate its pattern vector Ω, the distances ε_k to each known class, and the distance ε to face space. If the distance ε > θ_t, classify the input image as not a face. If the minimum distance ε_k ≤ θ_ε and the distance ε ≤ θ_t, classify the input face as the individual associated with class vector Ω_k. If the minimum distance ε_k > θ_ε and ε ≤ θ_t, then the image may be classified as "unknown", and optionally used to begin a new face class.

6. If the new image is classified as a known individual, this image may be added to the original set of familiar face images, and the eigenfaces may be recalculated (steps 1-4). This gives the opportunity to modify the face space as the system encounters more instances of known faces.

In the described embodiment, calculation of the eigenfaces is done offline as part of the training. The recognition currently takes about 400 msec running rather inefficiently in Lisp on a Sun 4, using face images of size 128 by 128. With some special-purpose hardware, the current version could run at close to frame rate (33 msec).

Designing a practical system for face recognition within this framework requires assessing the tradeoffs between generality, required accuracy, and speed. If the face recognition task is restricted to a small set of people (such as the members of a family or a small company), a small set of eigenfaces is adequate to span the faces of interest. If the system is to learn new faces or represent many people, a larger basis set of eigenfaces will likely be required.

Motion Detection And Head Tracking

In the described embodiment, motion detection module 6 and head locator module 8 locate and track the position of the head of any person within the scene viewed by video camera 4 by implementing the tracking algorithm depicted in FIG. 4. A sequence of image frames 30 from video camera 4 first passes through a spatio-temporal filtering module 32 which accentuates image locations which change with time. Spatio-temporal filtering module 32 identifies the locations of motion by performing a differencing operation on successive frames of the sequence of image frames. In the output of the spatio-temporal filter module 32, a moving person "lights up", whereas the other areas of the image containing no motion appear as black.

The spatio-temporal filtered image passes to a thresholding module 34 which produces a binary motion image identifying the locations of the image for which the motion exceeds a preselected threshold. That is, it locates the areas of the image containing the most motion. In all such areas, the presence of a person is postulated.

A motion analyzer module 36 analyzes the binary motion image to watch how "motion blobs" change over time to decide if the motion is caused by a person moving and to determine head position. A few simple rules are applied, such as "the head is the small upper blob above a larger blob (i.e., the body)", and "head motion must be reasonably slow and contiguous" (i.e., heads are not expected to jump around the image erratically).

The motion image also allows for an estimate of scale. The size of the blob that is assumed to be the moving head determines the size of the subimage to send to face recognition module 10 (see FIG. 1). This subimage is rescaled to fit the dimensions of the eigenfaces.
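A simplified sketch of the filtering and thresholding stages (modules 32 and 34); the blob-analysis rules of module 36 are omitted, and all names are illustrative:

```python
import numpy as np

def binary_motion_image(prev_frame, frame, threshold):
    """Frame differencing followed by thresholding: moving regions
    'light up' as 1, static background stays 0."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

def motion_bounding_box(motion):
    """Extent of the motion blob; its size gives the scale estimate used
    to rescale the head subimage to the eigenface dimensions."""
    ys, xs = np.nonzero(motion)
    if ys.size == 0:
        return None                       # no motion in this frame pair
    return ys.min(), ys.max(), xs.min(), xs.max()
```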

Using "Face Space" To Locate The Face

Face space may also be used to locate faces in single images, either as an alternative to locating faces from motion (e.g. if there is too little motion or many moving objects) or as a method of achieving more precision than is possible by use of motion tracking alone.

Typically, images of faces do not change radically when projected into the face space, whereas the projections of non-face images appear quite different. This basic idea may be used to detect the presence of faces in a scene. To implement this approach, the distance ε between the local subimage and face space is calculated at every location in the image. This calculated distance from face space is then used as a measure of "faceness". The result of calculating the distance from face space at every point in the image is a "face map" ε(x,y) in which low values (i.e., the dark areas) indicate the presence of a face.

Direct application of Eq. 10, however, is rather expensive computationally. A simpler, more efficient method of calculating the face map ε(x,y) is as follows.

To calculate the face map at every pixel of an image I(x,y), the subimage centered at that pixel is projected onto face space and the projection is then subtracted from the original subimage. To project a subimage Γ onto face space, one first subtracts the mean image (i.e., Ψ), resulting in Φ = Γ − Ψ. With Φ_f being the projection of Φ onto face space, the distance measure at a given image location is then

ε² = ‖Φ − Φ_f‖² = Φ^T Φ − Φ_f^T Φ_f,    (11)

since Φ_f ⊥ (Φ − Φ_f). Because Φ_f is a linear combination of the eigenfaces (Φ_f = Σ_i ω_i u_i) and the eigenfaces are orthonormal vectors,

Φ_f^T Φ_f = Σ_i ω_i²    (12)

and

ε²(x,y) = Φ^T(x,y) Φ(x,y) − Σ_i ω_i²(x,y),    (13)

where ε(x,y) and ω_i(x,y) are scalar functions of image location, and Φ(x,y) is a vector function of image location.

The second term of Eq. 13 is calculated in practice by a correlation with the L eigenfaces:

ω_i(x,y) = u_i^T Φ(x,y) = u_i ⊗ Γ(x,y) − u_i^T Ψ,    (14)

where ⊗ denotes the correlation operator. The first term of Eq. 13 becomes

Φ^T(x,y) Φ(x,y) = Γ^T(x,y) Γ(x,y) − 2 Ψ ⊗ Γ(x,y) + Ψ^T Ψ.    (15)

Since the average face Ψ and the eigenfaces u_i are fixed, the terms Ψ^T Ψ and u_i^T Ψ may be computed ahead of time.

Thus, the computation of the face map involves only L+1 correlations over the input image and the computation of the first term Γ^T(x,y) Γ(x,y). This is computed by squaring the input image I(x,y) and, at each image location, summing the squared values of the local subimage.
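A sketch of this face-map computation using correlations (Eqs. 13-15). It assumes SciPy's correlate2d for the correlation operator ⊗ and square eigenface images; it is an illustration, not the patent's implementation:

```python
import numpy as np
from scipy.signal import correlate2d

def face_map(image, psi_img, eigenface_imgs):
    """Distance-from-face-space eps^2(x,y) at every pixel (Eq. 13).

    image          -- full grayscale image I(x,y)
    psi_img        -- average face Psi as an n by n image
    eigenface_imgs -- list of L eigenfaces u_i as n by n images
    """
    image = image.astype(np.float64)
    window = np.ones_like(psi_img)

    # First term (Eq. 15): Gamma^T Gamma - 2 (Psi x Gamma) + Psi^T Psi,
    # where Gamma^T Gamma is the local sum of squared pixel values.
    gtg = correlate2d(image ** 2, window, mode="same")
    first = gtg - 2.0 * correlate2d(image, psi_img, mode="same") + np.sum(psi_img ** 2)

    # Second term (Eq. 14): omega_i = (u_i x Gamma) - u_i^T Psi, summed in square.
    second = np.zeros_like(first)
    for u in eigenface_imgs:
        omega = correlate2d(image, u, mode="same") - np.sum(u * psi_img)
        second += omega ** 2

    return first - second   # low values indicate the presence of a face
```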

Scale Invariance

Experiments reveal that recognition performance decreases quickly as the head size, or scale, is misjudged. It is therefore desirable that the head size in the input image be close to that of the eigenfaces. The motion analysis can give an estimate of head size, from which the face image is rescaled to the eigenface size.

Another approach to the scale problem, which may be separate from or in addition to the motion estimate, is to use multiscale eigenfaces, in which an input face image is compared with eigenfaces at a number of scales. In this case the image will appear to be near the face space of only the closest-scale eigenfaces. Equivalently, the input image (i.e., the portion of the overall image selected for analysis) can be scaled to multiple sizes, and the scale which results in the smallest distance measure to face space is used.
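One way to realize this multiple-size alternative (a sketch; dist_from_face_space is assumed to wrap the ε² computation from the recognition sketch earlier):

```python
import numpy as np

def resize_nn(img, size):
    """Nearest-neighbor rescale of an image to size by size."""
    rows = (np.arange(size) * img.shape[0] / size).astype(int)
    cols = (np.arange(size) * img.shape[1] / size).astype(int)
    return img[np.ix_(rows, cols)]

def best_scale(subimage, sizes, dist_from_face_space):
    """Rescale the candidate subimage to several sizes and keep the one
    whose rescaled image lies closest to face space (smallest eps^2)."""
    return min(sizes, key=lambda s: dist_from_face_space(resize_nn(subimage, s)))
```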

Other embodiments are within the following claims. For example, although the eigenfaces approach to face recognition has been presented as an information processing model, it may also be implemented using simple parallel computing elements, as in a connectionist system or artificial neural network.

Patent Citations
Cited Patent | Filing date | Publication date | Applicant | Title
US4636862 * | Feb 7, 1985 | Jan 13, 1987 | Kokusai Denshin Denwa Kabushiki Kaisha | System for detecting vector of motion of moving objects on picture
US4651289 * | Jan 24, 1983 | Mar 17, 1987 | Tokyo Shibaura Denki Kabushiki Kaisha | Pattern recognition apparatus and method for making same
US4752957 * | Sep 7, 1984 | Jun 21, 1988 | Kabushiki Kaisha Toshiba | Apparatus and method for recognizing unknown patterns
US4838644 * | Sep 15, 1987 | Jun 13, 1989 | The United States Of America As Represented By The United States Department Of Energy | Position, rotation, and intensity invariant recognizing method
US4858000 * | Sep 14, 1988 | Aug 15, 1989 | A. C. Nielsen Company | Image recognition audience measurement system and method
US4926491 * | Jun 6, 1988 | May 15, 1990 | Kabushiki Kaisha Toshiba | Pattern recognition device
US4930011 * | Aug 2, 1988 | May 29, 1990 | A. C. Nielsen Company | Method and apparatus for identifying individual members of a marketing and viewing audience
US4998286 * | Jan 20, 1988 | Mar 5, 1991 | Olympus Optical Co., Ltd. | Correlation operational apparatus for multi-dimensional images
US5031228 * | Sep 14, 1988 | Jul 9, 1991 | A. C. Nielsen Company | Image recognition system and method
Non-Patent Citations
Reference
1. L. Sirovich et al., "Low-dimensional procedure for the characterization of human faces," 1987 Optical Society of America, pp. 519-524.
Referenced by
Citing Patent | Filing date | Publication date | Applicant | Title
US6445810 *Dec 1, 2000Sep 3, 2002Interval Research CorporationMethod and apparatus for personnel detection and tracking
US6456320 *May 26, 1998Sep 24, 2002Sanyo Electric Co., Ltd.Monitoring system and imaging system
US6501857 *Jul 20, 1999Dec 31, 2002Craig GotsmanMethod and system for detecting and classifying objects in an image
US6529620Sep 12, 2001Mar 4, 2003Pinotage, L.L.C.System and method for obtaining and utilizing maintenance information
US6535620 *Mar 12, 2001Mar 18, 2003Sarnoff CorporationMethod and apparatus for qualitative spatiotemporal data processing
US6597801 *Dec 20, 1999Jul 22, 2003Hewlett-Packard Development Company L.P.Method for object registration via selection of models with dynamically ordered features
US6618490 *Dec 20, 1999Sep 9, 2003Hewlett-Packard Development Company, L.P.Method for efficiently registering object models in images via dynamic ordering of features
US6628811 *Mar 18, 1999Sep 30, 2003Matsushita Electric Industrial Co. Ltd.Method and apparatus for recognizing image pattern, method and apparatus for judging identity of image patterns, recording medium for recording the pattern recognizing method and recording medium for recording the pattern identity judging method
US6628834 *Jul 11, 2002Sep 30, 2003Hewlett-Packard Development Company, L.P.Template matching system for images
US6690414 *Dec 12, 2000Feb 10, 2004Koninklijke Philips Electronics N.V.Method and apparatus to reduce false alarms in exit/entrance situations for residential security monitoring
US6724920Jul 21, 2000Apr 20, 2004Trw Inc.Application of human facial features recognition to automobile safety
US6795567May 5, 2000Sep 21, 2004Hewlett-Packard Development Company, L.P.Method for efficiently tracking object models in video sequences via dynamic ordering of features
US6810135Jun 29, 2000Oct 26, 2004Trw Inc.Optimized human presence detection through elimination of background interference
US6816085Nov 17, 2000Nov 9, 2004Michael N. HaynesMethod for managing a parking lot
US6865296 *Jun 5, 2001Mar 8, 2005Matsushita Electric Industrial Co., Ltd.Pattern recognition method, pattern check method and pattern recognition apparatus as well as pattern check apparatus using the same methods
US6873743Mar 29, 2002Mar 29, 2005Fotonation Holdings, LlcMethod and apparatus for the automatic real-time detection and correction of red-eye defects in batches of digital images or in handheld appliances
US6904168Oct 22, 2001Jun 7, 2005Fotonation Holdings, LlcWorkflow system for detection and classification of images suspected as pornographic
US6904347Jun 29, 2000Jun 7, 2005Trw Inc.Human presence detection, identification and tracking using a facial feature image sensing system for airbag deployment
US6965694 *Nov 27, 2001Nov 15, 2005Honda Giken Kogyo Kabushiki KaisaMotion information recognition system
US6975763 *Jul 11, 2001Dec 13, 2005Minolta Co., Ltd.Shade component removing apparatus and shade component removing method for removing shade in image
US7050084Sep 24, 2004May 23, 2006Avaya Technology Corp.Camera frame display
US7054468Jul 22, 2002May 30, 2006Honda Motor Co., Ltd.Face recognition using kernel fisherfaces
US7068301Apr 24, 2002Jun 27, 2006Pinotage L.L.C.System and method for obtaining and utilizing maintenance information
US7085774Aug 30, 2001Aug 1, 2006Infonox On The WebActive profiling system for tracking and quantifying customer conversion efficiency
US7103215May 7, 2004Sep 5, 2006Potomedia Technologies LlcAutomated detection of pornographic images
US7110570Jul 21, 2000Sep 19, 2006Trw Inc.Application of human facial features recognition to automobile security and convenience
US7188307 *Nov 8, 2001Mar 6, 2007Canon Kabushiki KaishaAccess system
US7227567Sep 14, 2004Jun 5, 2007Avaya Technology Corp.Customizable background for video communications
US7269292Jun 26, 2003Sep 11, 2007Fotonation Vision LimitedDigital image adjustable compression and resolution using face detection information
US7295687 *Jul 31, 2003Nov 13, 2007Samsung Electronics Co., Ltd.Face recognition method using artificial neural network and apparatus thereof
US7315630Jun 26, 2003Jan 1, 2008Fotonation Vision LimitedPerfecting of digital image rendering parameters within rendering devices using face detection
US7317815Jun 26, 2003Jan 8, 2008Fotonation Vision LimitedDigital image processing composition using face detection information
US7331671Mar 29, 2004Feb 19, 2008Delphi Technologies, Inc.Eye tracking method based on correlation and detected eye movement
US7362368Jun 26, 2003Apr 22, 2008Fotonation Vision LimitedPerfecting the optics within a digital image acquisition device using face detection
US7362885Apr 20, 2004Apr 22, 2008Delphi Technologies, Inc.Object tracking and eye state identification method
US7379602Jul 16, 2003May 27, 2008Honda Giken Kogyo Kabushiki KaishaExtended Isomap using Fisher Linear Discriminant and Kernel Fisher Linear Discriminant
US7382903 *Nov 19, 2003Jun 3, 2008Eastman Kodak CompanyMethod for selecting an emphasis image from an image collection based upon content recognition
US7388971Oct 23, 2003Jun 17, 2008Northrop Grumman CorporationRobust and low cost optical system for sensing stress, emotion and deception in human subjects
US7440593Jun 26, 2003Oct 21, 2008Fotonation Vision LimitedMethod of improving orientation and color balance of digital images using face detection information
US7460150Mar 14, 2005Dec 2, 2008Avaya Inc.Using gaze detection to determine an area of interest within a scene
US7466866Jul 5, 2007Dec 16, 2008Fotonation Vision LimitedDigital image adjustable compression and resolution using face detection information
US7471846Jun 26, 2003Dec 30, 2008Fotonation Vision LimitedPerfecting the effect of flash within an image acquisition devices using face detection
US7512571Aug 26, 2003Mar 31, 2009Paul RudolfAssociative memory device and method based on wave propagation
US7564476May 13, 2005Jul 21, 2009Avaya Inc.Prevent video calls based on appearance
US7565030Dec 27, 2004Jul 21, 2009Fotonation Vision LimitedDetecting orientation of digital images using face detection information
US7570785Nov 29, 2007Aug 4, 2009Automotive Technologies International, Inc.Face monitoring system and method for vehicular occupants
US7574016Jun 26, 2003Aug 11, 2009Fotonation Vision LimitedDigital image processing using face detection information
US7616233Jun 26, 2003Nov 10, 2009Fotonation Vision LimitedPerfecting of digital image capture parameters within acquisition devices using face detection
US7620216Jun 14, 2006Nov 17, 2009Delphi Technologies, Inc.Method of tracking a human eye in a video image
US7620218Jun 17, 2008Nov 17, 2009Fotonation Ireland LimitedReal-time face tracking with reference images
US7630527Jun 20, 2007Dec 8, 2009Fotonation Ireland LimitedMethod of improving orientation and color balance of digital images using face detection information
US7634109Oct 30, 2008Dec 15, 2009Fotonation Ireland LimitedDigital image processing using face detection information
US7650034Dec 14, 2005Jan 19, 2010Delphi Technologies, Inc.Method of locating a human eye in a video image
US7652593Oct 5, 2006Jan 26, 2010Haynes Michael NMethod for managing a parking lot
US7660445 *Apr 17, 2008Feb 9, 2010Eastman Kodak CompanyMethod for selecting an emphasis image from an image collection based upon content recognition
US7668304Jan 25, 2006Feb 23, 2010Avaya Inc.Display hierarchy of participants during phone call
US7684630Dec 9, 2008Mar 23, 2010Fotonation Vision LimitedDigital image adjustable compression and resolution using face detection information
US7688225Oct 22, 2007Mar 30, 2010Haynes Michael NMethod for managing a parking lot
US7693311Jul 5, 2007Apr 6, 2010Fotonation Vision LimitedPerfecting the effect of flash within an image acquisition devices using face detection
US7702136Jul 5, 2007Apr 20, 2010Fotonation Vision LimitedPerfecting the effect of flash within an image acquisition devices using face detection
US7706576Dec 28, 2004Apr 27, 2010Avaya Inc.Dynamic video equalization of images using face-tracking
US7809162Oct 30, 2008Oct 5, 2010Fotonation Vision LimitedDigital image processing using face detection information
US7844076Oct 30, 2006Nov 30, 2010Fotonation Vision LimitedDigital image processing using face detection and skin tone information
US7844135Jun 10, 2009Nov 30, 2010Tessera Technologies Ireland LimitedDetecting orientation of digital images using face detection information
US7848549Oct 30, 2008Dec 7, 2010Fotonation Vision LimitedDigital image processing using face detection information
US7853043Dec 14, 2009Dec 14, 2010Tessera Technologies Ireland LimitedDigital image processing using face detection information
US7855737Mar 26, 2008Dec 21, 2010Fotonation Ireland LimitedMethod of making a digital camera image of a scene including the camera user
US7860274Oct 30, 2008Dec 28, 2010Fotonation Vision LimitedDigital image processing using face detection information
US7864990Dec 11, 2008Jan 4, 2011Tessera Technologies Ireland LimitedReal-time face tracking in a digital image acquisition device
US7912245Jun 20, 2007Mar 22, 2011Tessera Technologies Ireland LimitedMethod of improving orientation and color balance of digital images using face detection information
US7916897Jun 5, 2009Mar 29, 2011Tessera Technologies Ireland LimitedFace tracking for controlling imaging parameters
US7916971May 24, 2007Mar 29, 2011Tessera Technologies Ireland LimitedImage processing method and apparatus
US7953251Nov 16, 2010May 31, 2011Tessera Technologies Ireland LimitedMethod and apparatus for detection and correction of flash-induced eye defects within digital images using preview or other reference images
US8005265Sep 8, 2008Aug 23, 2011Tessera Technologies Ireland LimitedDigital image processing using face detection information
US8031914Oct 11, 2006Oct 4, 2011Hewlett-Packard Development Company, L.P.Face-based image clustering
US8050465Jul 3, 2008Nov 1, 2011DigitalOptics Corporation Europe LimitedReal-time face tracking in a digital image acquisition device
US8055029Jun 18, 2007Nov 8, 2011DigitalOptics Corporation Europe LimitedReal-time face tracking in a digital image acquisition device
US8055090Sep 14, 2010Nov 8, 2011DigitalOptics Corporation Europe LimitedDigital image processing using face detection information
US8064653Nov 29, 2007Nov 22, 2011Viewdle, Inc.Method and system of person identification by facial image
US8135184May 23, 2011Mar 13, 2012DigitalOptics Corporation Europe LimitedMethod and apparatus for detection and correction of multiple image defects within digital images using preview or other reference images
US8155397Sep 26, 2007Apr 10, 2012DigitalOptics Corporation Europe LimitedFace tracking in a camera processor
US8155401Sep 29, 2010Apr 10, 2012DigitalOptics Corporation Europe LimitedPerfecting the effect of flash within an image acquisition devices using face detection
US8160312Sep 29, 2010Apr 17, 2012DigitalOptics Corporation Europe LimitedPerfecting the effect of flash within an image acquisition devices using face detection
US8165282Aug 30, 2006Apr 24, 2012Avaya Inc.Exploiting facial characteristics for improved agent selection
US8213737Jun 20, 2008Jul 3, 2012DigitalOptics Corporation Europe LimitedDigital image enhancement with reference images
US8224039Sep 3, 2008Jul 17, 2012DigitalOptics Corporation Europe LimitedSeparating a directional lighting variability in statistical face modelling based on texture space decomposition
US8243182Nov 8, 2010Aug 14, 2012DigitalOptics Corporation Europe LimitedMethod of making a digital camera image of a scene including the camera user
US8251597Oct 15, 2010Aug 28, 2012Wavecam Media, Inc.Aerial support structure for capturing an image of a target
US8270674Jan 3, 2011Sep 18, 2012DigitalOptics Corporation Europe LimitedReal-time face tracking in a digital image acquisition device
US8320641Jun 19, 2008Nov 27, 2012DigitalOptics Corporation Europe LimitedMethod and apparatus for red-eye detection using preview or other reference images
US8326066Mar 8, 2010Dec 4, 2012DigitalOptics Corporation Europe LimitedDigital image adjustable compression and resolution using face detection information
US8330831Jun 16, 2008Dec 11, 2012DigitalOptics Corporation Europe LimitedMethod of gathering visual meta data using a reference image
US8345114Jul 30, 2009Jan 1, 2013DigitalOptics Corporation Europe LimitedAutomatic face and skin beautification using face detection
US8379917Oct 2, 2009Feb 19, 2013DigitalOptics Corporation Europe LimitedFace recognition performance using additional image features
US8384793Jul 30, 2009Feb 26, 2013DigitalOptics Corporation Europe LimitedAutomatic face and skin beautification using face detection
US8385610Jun 11, 2010Feb 26, 2013DigitalOptics Corporation Europe LimitedFace tracking for controlling imaging parameters
US8433050Feb 6, 2006Apr 30, 2013Avaya Inc.Optimizing conference quality with diverse codecs
US8494232Feb 25, 2011Jul 23, 2013DigitalOptics Corporation Europe LimitedImage processing method and apparatus
US8494286Feb 5, 2008Jul 23, 2013DigitalOptics Corporation Europe LimitedFace detection in mid-shot digital images
US8498452Aug 26, 2008Jul 30, 2013DigitalOptics Corporation Europe LimitedDigital image processing using face detection information
US8503800Feb 27, 2008Aug 6, 2013DigitalOptics Corporation Europe LimitedIllumination detection using classifier chains
US8509496Nov 16, 2009Aug 13, 2013DigitalOptics Corporation Europe LimitedReal-time face tracking with reference images
US8509561Feb 27, 2008Aug 13, 2013DigitalOptics Corporation Europe LimitedSeparating directional lighting variability in statistical face modelling based on texture space decomposition
US8515138May 8, 2011Aug 20, 2013DigitalOptics Corporation Europe LimitedImage processing method and apparatus
US8577616Dec 16, 2004Nov 5, 2013Aerulean Plant Identification Systems, Inc.System and method for plant identification
US8593542Jun 17, 2008Nov 26, 2013DigitalOptics Corporation Europe LimitedForeground/background separation using reference images
US8649604Jul 23, 2007Feb 11, 2014DigitalOptics Corporation Europe LimitedFace searching and detection in a digital image acquisition device
US8675991Jun 2, 2006Mar 18, 2014DigitalOptics Corporation Europe LimitedModification of post-viewing parameters for digital images using region or feature information
US8682097Jun 16, 2008Mar 25, 2014DigitalOptics Corporation Europe LimitedDigital image enhancement with reference images
Classifications
U.S. Classification: 382/118, 382/204, 382/201
International Classification: G07C9/00, A61B5/117, H04N7/28, H04H60/59, G06K9/62, H04N7/26, G06K9/00, H04H60/45, H04H1/00, H04H60/56
Cooperative Classification: H04N19/00387, H04N19/00, H04N21/42201, H04N19/00963, G06K9/00241, H04H60/45, H04H60/56, G06K9/6232, G06K9/6247, G06K9/00228, A61B5/1176, G07C9/00158, H04H60/59, G06K9/00275
European Classification: H04N21/422B, G07C9/00C2D, A61B5/117F, H04N7/26, G06K9/00F2H, H04N7/28, G06K9/00F1H, G06K9/62B4P, G06K9/62B4, H04N7/26J4, G06K9/00F1, H04H60/45, H04H60/56
Legal Events
Date | Code | Event | Description
Jul 25, 2005 | PRDP | Patent reinstated due to the acceptance of a late maintenance fee | Effective date: 19990112
Jun 30, 2005 | FPAY | Fee payment | Year of fee payment: 12
Jun 30, 2005 | SULP | Surcharge for late payment
Aug 24, 2000 | FPAY | Fee payment | Year of fee payment: 8
Aug 24, 2000 | SULP | Surcharge for late payment