|Publication number||US6154559 A|
|Application number||US 09/164,734|
|Publication date||Nov 28, 2000|
|Filing date||Oct 1, 1998|
|Priority date||Oct 1, 1998|
|Also published as||EP0990416A1|
|Inventors||Paul Anthony Beardsley|
|Original Assignee||Mitsubishi Electric Information Technology Center America, Inc. (Ita)|
This invention relates to the classification of gaze direction for an individual, and more particularly to a qualitative approach which involves automatic identification and labeling of frequently occurring head poses by means of a pose-space histogram, without any need for accurate camera calibration or explicit 3D metric measurements of the surrounding environment, or of the individual's head and eyes.
In the past, the classification of the gaze direction of a vehicle driver has been important for detecting, amongst other things, a drowsy driver. Moreover, gaze detection systems, in conjunction with external sensors such as infra-red, microwave or sonar ranging to ascertain obstacles in the path of the vehicle, are useful to determine if a driver is paying attention to possible collisions. If the driver is not paying attention and is instead looking away from the direction of travel, it is desirable to raise an alarm automatically. Such automatic systems are described in "Sounds and scents to jolt drowsy drivers", Wall Street Journal, page B1, May 3, 1993. Furthermore, more sophisticated systems might attempt to learn the characteristic activity of a particular driver prior to maneuvers, enabling anticipation of those maneuvers in the future.
An explicit quantitative approach to this problem involves (a) calibrating the camera used to observe the driver, modeling the interior geometry of the car, and storing this as a priori information, and (b) making an accurate 3D metric computation of the driver's location, head pose and gaze direction. Generating a 3D ray for the driver's gaze direction in the car coordinate frame then determines what the driver is looking at.
There are problems with this approach. Firstly, although the geometry of the car's interior will usually be known from the manufacturer's design data, the camera's intrinsic parameters, such as focal length, and extrinsic parameters, such as location and orientation relative to the car coordinate frame, need to be calibrated. That extrinsic calibration may change over time due to vibration. Furthermore, the location of the driver's head, head pose and eye direction must be computed in the car coordinate frame at run-time. This is difficult to do robustly, and is computationally intensive for the typical low-power processor installed in a car.
Acknowledging these difficulties, an alternative approach is adopted which avoids altogether the need for accurate camera calibration, accurate 3D measurements of the geometry of the surrounding environment, or 3D metric measurements of driver location, head and eye pose. The term "qualitative" as used herein indicates that there is no computation of 3D metric measurements, such as distances and angles for the individual's head. In the subject system, the measurements carried out are accurate and repeatable and fulfill the required purpose of classifying gaze direction.
In one embodiment, each possible head pose of the individual is associated with a bin in a "pose-space histogram". Typically the set of all possible head poses, arising from turning the head left to right, up and down, and tilting sideways, maps to a few hundred different bins in the histogram. A method is provided below for efficiently matching an observed head pose of the driver to its corresponding histogram bin, without explicit 3D metric measurements of the individual's location and head pose. Each time an observed head pose is matched to a histogram bin, the count in that bin is incremented. Over an extended period of time, the histogram develops peaks which indicate those head poses which are occurring most frequently. For a car driver, peaks can be expected to occur for the driver looking toward the dashboard, at the mirrors, out of the side window, or straight-ahead. It is straightforward to label these peaks from a qualitative description of the relative location of dashboard, mirrors, side window and windscreen. The gaze direction of the driver in all subsequent images is then classified by measuring head pose and checking whether it is close to a labelled peak in the pose-space. The basic scheme is augmented, when necessary, by processing of eye pose as described further below.
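By way of illustration only, the accumulation of the pose-space histogram and the detection of its peaks might be sketched in Python as follows. The grid dimensions, the `min_count` threshold, and the rule that a peak is a bin no smaller than any of its eight neighbours are assumptions made for this sketch and are not specified in the text.

```python
def make_histogram(rows, cols):
    """Create a pose-space histogram with every bin initialized to zero."""
    return [[0] * cols for _ in range(rows)]


def record_pose(hist, bin_index):
    """Increment the bin matched by one observed head pose."""
    r, c = bin_index
    hist[r][c] += 1


def find_peaks(hist, min_count):
    """Return bins with at least min_count hits that are not exceeded
    by any of their 8 neighbours (an assumed peak definition)."""
    rows, cols = len(hist), len(hist[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            v = hist[r][c]
            if v < min_count:
                continue
            neighbours = [
                hist[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(v >= n for n in neighbours):
                peaks.append((r, c))
    return peaks


# Hypothetical usage: a grid of bins, one per template.
hist = make_histogram(11, 17)
for _ in range(10):
    record_pose(hist, (5, 8))   # e.g. looking straight-ahead
for _ in range(4):
    record_pose(hist, (5, 2))   # e.g. looking at a side mirror
record_pose(hist, (5, 3))       # a single passing glance
peaks = find_peaks(hist, min_count=3)
```

Each detected peak would then be given a label such as "straight-ahead" from the qualitative layout of the surroundings.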
In one embodiment, there are five components to the system--(a) initialization, (b) a fast and efficient method for associating an observed head pose with a number of possible candidate "templates", which are synthetically generated images showing various poses of the head, followed by (c) a refinement method which associates the observed head pose with a unique template, (d) augmentation of the head pose processing with eye pose processing if necessary, and (e) classification of the gaze direction.
Initialization employs a generic head model to model the individual's head. A generic head model, as used herein, refers to a digital 3D model in the computer database with the dimensions of an average person's head. For the initialization phase, the camera is trained on the individual as the head is moved during a typical driving scenario. The subject system automatically records the texture or visual appearance of the individual's face, such as skin color, hair color, the appearance of the eyebrows, eyes, nose, and ears, and the appearance of glasses, a beard or a moustache. The texture is used in conjunction with the generic head model to generate an array of typically several hundred synthetic templates which show the appearance of the individual's head over a range of all the head poses which might occur. The synthetic templates include head poses which were not seen in the original images. Corresponding to each template in the template array, there is a bin in the pose-space histogram. All bins in the histogram are initialized to zero.
Four main pieces of information are computed and stored for each template. Firstly, the position of the eyes on the generic head model is known a priori, hence the position of the eye or eyes is known in each template. Secondly, the skin region in the templates is segmented, or identified, using the algorithm in "Finding skin in color images" by R. Kjeldsen et al, 2nd Intl Conf on Automatic Face and Gesture Recognition, 1996. Thirdly, a 1D projection is taken along the horizontal direction of each template, with this projection containing a count of the number of segmented skin pixels along each row. Fourthly, moments of the skin region are computed to provide a "shape signature" for the skin region. Examples of moments are described in "Visual Pattern Recognition by Moment Invariants" by M. K. Hu, IEEE Trans Inf Theory, IT-8, 1962. For clarity below, the segmented skin region of a template will be referred to as TS, and the 1D projection will be referred to as T1D.
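The 1D projection and the moments of a segmented skin region can be illustrated with a short Python sketch. It assumes the skin region is available as a binary mask (a list of rows of 0/1 values), and it computes plain second-order central moments rather than the moment invariants of the cited Hu paper; both choices are simplifications made for this example.

```python
def row_projection(mask):
    """1D projection: number of skin pixels along each row of the mask."""
    return [sum(row) for row in mask]


def centre_of_gravity(mask):
    """Centre of gravity (x, y) of the skin pixels."""
    area = sum(sum(row) for row in mask)
    if area == 0:
        raise ValueError("empty skin mask")
    cx = sum(x for row in mask for x, v in enumerate(row) if v) / area
    cy = sum(y for y, row in enumerate(mask) for v in row if v) / area
    return cx, cy


def central_moments(mask):
    """Second-order central moments (mu20, mu02, mu11) of the skin region."""
    cx, cy = centre_of_gravity(mask)
    mu20 = mu02 = mu11 = 0.0
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if v:
                mu20 += (x - cx) ** 2
                mu02 += (y - cy) ** 2
                mu11 += (x - cx) * (y - cy)
    return mu20, mu02, mu11


# Hypothetical 3x4 skin mask standing in for the region TS of a template.
ts = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
t1d = row_projection(ts)
signature = central_moments(ts)
```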
As to fast matching of observed head pose to several possible candidate templates, a newly acquired image of an individual is processed by segmenting the skin region, and creating a 1D projection of the segmented region along the horizontal direction. The segmented region is labeled RS and the 1D projection is labelled R1D.
The first part of the processing deals with the problem that the segmented region RS may be displaced from the image center. Furthermore, region RS contains the head and may also contain the neck, but the segmented skin region TS in a template contains the individual's head only and not the neck. In order to do a comparison between RS and TS, the position of the face must be localized and the neck part must be discarded from the comparison.
Consider the comparison between the acquired image and one specific template. The horizontal offset between the center of gravity, COG, of the region RS and the COG of the region TS is taken as the horizontal offset which is most likely to align the face in the image and template.
Then the 1D projection R1D for the acquired image is compared with the 1D projection T1D for the template using the comparison method in "Color Constant Color Indexing" by B. V. Funt and G. D. Finlayson, PAMI, Vol 17, No 5, 1995, for a range of offsets between R1D and T1D. The offset at which R1D is most similar to T1D indicates the vertical offset between the acquired image and the template which is most likely to align the face in the acquired image and template. Thus the face has now also been localized in the vertical direction and the remaining processing will be carried out in such a way that the individual's neck, if present, is disregarded.
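The localization step can be sketched as follows. This is a minimal illustration only: the comparison of projections uses a sum of absolute differences in place of the cited Funt and Finlayson method, and the function names are hypothetical. The template projection is slid along the image projection and must lie wholly inside it, so that any extra rows contributed by the neck fall outside the compared window.

```python
def horizontal_offset(rs_cog_x, ts_cog_x):
    """Horizontal shift that brings the COG of TS onto the COG of RS."""
    return rs_cog_x - ts_cog_x


def best_vertical_offset(r1d, t1d):
    """Slide T1D along R1D and return the row offset with the smallest
    sum of absolute differences (a stand-in similarity measure)."""
    if len(t1d) > len(r1d):
        raise ValueError("template projection longer than image projection")
    best_shift, best_cost = 0, None
    for shift in range(len(r1d) - len(t1d) + 1):
        cost = sum(abs(r1d[shift + i] - t) for i, t in enumerate(t1d))
        if best_cost is None or cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


# Hypothetical projections: the image has two blank rows above the face
# and two narrow "neck" rows below it.
t1d = [2, 4, 4, 2]
r1d = [0, 0, 2, 4, 4, 2, 1, 1]
dy = best_vertical_offset(r1d, t1d)
dx = horizontal_offset(10.5, 8.0)
```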
With the acquired image aligned with the template using the computed horizontal and vertical offsets, the moments of the parts of skin region RS in the acquired image which overlap the skin region TS in the template are computed. A score, labelled S, is computed for the measure of similarity between the moments for region RS and the moments for region TS. This process is repeated for every template to generate a set of scores S1, S2. . . Sn. The templates are ranked according to score and a fixed percentage of the top-scoring templates is retained. The remaining templates are discarded for the remainder of the processing on the current image.
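The culling of templates by score might look like the following sketch. It assumes that a higher score S means a closer match and that the retained percentage is passed in as a fraction; the text states neither the direction of the score nor the percentage used.

```python
import math


def retain_top_fraction(scores, fraction):
    """Return the indices of the top-scoring templates, best first.

    scores   -- similarity scores S1..Sn, one per template (higher is better,
                an assumption of this sketch)
    fraction -- fixed fraction of templates to keep, 0 < fraction <= 1
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:keep]


# Hypothetical scores for four templates; keep the better half.
survivors = retain_top_fraction([0.2, 0.9, 0.5, 0.7], 0.5)
```

Only the surviving indices are passed on to the more expensive comparison that follows.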
As to matching of observed head pose to a unique template, the previous section described the use of segmented skin regions and their shape, based on moments, to identify the most likely matching templates. Processing now returns to the raw color pixel data, including all pixels, skin and non-skin.
Consider the comparison between the acquired image and one specific template from the set of surviving candidates. The position of the face in the acquired image which is most consistent with the face in the template has already been determined, as described in the localization process above. At this position, the acquired image is compared with the synthetic template using cross-correlation of the gradients of the image color, or "image color gradients". This generates a score for the similarity between the individual's head in the acquired image and the synthetic head in the template.
This is repeated for all the candidate templates, and the best score indicates the best-matching template. The histogram bin corresponding to this template is incremented. It will be appreciated that in the subject system, the updating of the histogram, which will subsequently provide information about frequently occurring head poses, has been achieved without making any 3D metric measurements such as distances or angles for the head location or head pose.
Note that the cross-correlation used in this stage is computationally intensive making it difficult to achieve real-time processing when comparing an image with hundreds of templates. By first carrying out the fast culling process described previously to eliminate templates which are unlikely to match, cross-correlation can be incorporated at this stage while still achieving real-time processing.
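As a rough illustration of the refinement stage, the sketch below scores each surviving template by normalized cross-correlation of image gradients and picks the best one. It is an assumption-laden simplification: it works on a single intensity channel rather than color, uses central differences for the gradients, and assumes the image has already been cropped and aligned to the same size as each template.

```python
import math


def gradients(img):
    """Central-difference gradients (gx, gy) at interior pixels, flattened."""
    h, w = len(img), len(img[0])
    g = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            g.append(img[y][x + 1] - img[y][x - 1])  # horizontal gradient
            g.append(img[y + 1][x] - img[y - 1][x])  # vertical gradient
    return g


def gradient_correlation(a, b):
    """Normalized cross-correlation of the gradient fields of a and b."""
    ga, gb = gradients(a), gradients(b)
    dot = sum(p * q for p, q in zip(ga, gb))
    na = math.sqrt(sum(p * p for p in ga))
    nb = math.sqrt(sum(q * q for q in gb))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)


def best_template(image, candidates):
    """Return the key of the candidate template with the highest score."""
    return max(candidates, key=lambda k: gradient_correlation(image, candidates[k]))


# Hypothetical 4x4 data: the image is a horizontal intensity ramp.
image = [[0, 1, 2, 3] for _ in range(4)]
candidates = {
    "A": [[10, 12, 14, 16] for _ in range(4)],  # same gradient direction
    "B": [[y] * 4 for y in range(4)],           # vertical ramp instead
}
match = best_template(image, candidates)
```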
As to processing of eye pose, head pose alone does not always determine gaze direction. But for the illustrative application here, the head pose is often a good indicator of the driver's focus of attention. For instance, looking at the side or rear-view mirrors always seems to involve the adoption of a particular head pose. The situation is different when one wishes to discriminate between a driver looking straight-ahead or looking towards the dashboard, since this may involve little or no head motion. To deal with the latter case, the subject system further utilizes a qualitative classification of eye direction to indicate whether the eyes are directed straight-ahead or downward. This processing is only applied if necessary. For the illustrative application, it is only applied if the driver's head pose is straight-ahead in which case it is necessary to discriminate between eyes straight-forward and eyes downward.
At this stage, the acquired image has been matched to one of the templates, and since the position of the eye or eyes in this template is known, the position of the eyes in the acquired image is also known. The algorithm for processing the eyes is targeted at the area around their expected position and not over the whole image.
Skin pixels have already been identified in the acquired image. In the area around the eye, the skin segmentation is examined. Typically, for an individual without glasses, the segmented skin region will have two "holes" corresponding to non-skin area, one for the eyebrow and one for the eye, and the lower of these two holes is taken. This hole has the same shape as the outline of the eye. The "equivalent rectangle" of this shape is computed. An algorithm to do this is described in "Computer Vision for Interactive Computer Games" by Freeman et al, IEEE Computer Graphics and Applications, Vol 18, No 3, 1998. The ratio of the height of this rectangle to its width provides a measure of whether the eye is directed straight-forward or downward. This is because looking downward causes the eyelid to drop, hence narrowing the outline of the eye, and causing the height of the equivalent rectangle to decrease. A fixed threshold is applied to the height/width ratio to decide if the eye is directed straight-forward or downward.
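The height/width test can be sketched in Python as below. The sketch assumes the eye hole is given as a binary mask and takes the equivalent rectangle to be the axis-aligned rectangle whose second-order moments match those of the region (side length equal to the square root of twelve times the variance along that axis). This may differ in detail from the algorithm of the cited Freeman et al paper, and the threshold value of 0.35 is purely hypothetical.

```python
import math


def equivalent_rectangle(mask):
    """Return (width, height) of an axis-aligned rectangle with the same
    second-order moments as the non-zero pixels of mask."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        raise ValueError("empty eye region")
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    var_x = sum((x - cx) ** 2 for x, _ in pts) / n
    var_y = sum((y - cy) ** 2 for _, y in pts) / n
    # A uniform rectangle of side s has variance s*s/12 along that side.
    return math.sqrt(12 * var_x), math.sqrt(12 * var_y)


def classify_eye(mask, ratio_threshold=0.35):
    """Classify eye direction from the height/width ratio (assumed threshold)."""
    width, height = equivalent_rectangle(mask)
    if width == 0:
        raise ValueError("degenerate eye region")
    return "straight-forward" if height / width >= ratio_threshold else "downward"


# Hypothetical eye holes: 8 pixels wide, 4 rows tall (open) or 2 rows (lowered lid).
open_eye = [[1] * 8 for _ in range(4)]
lowered_eye = [[1] * 8 for _ in range(2)]
```

With these masks the ratios are roughly 0.49 and 0.22, which fall on either side of the assumed threshold.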
As to classification of the gaze direction, the time history of the observed head behavior is recorded in the pose-space histogram in the following way. As already described, each time an image is matched to its most similar template, the element in the histogram corresponding to that template is incremented. Note, in one embodiment, there is one histogram element corresponding to each template. Peaks will appear in the histogram for the most frequently adopted head poses, and hence for the most frequently recurring view directions. Each peak is labelled using qualitative or approximate information about the geometry of the vehicle around the driver.
The gaze direction of the driver in any subsequent image is classified by matching that image with a template, finding the corresponding element in the histogram, and checking for a nearby labelled peak. In some cases, this must be augmented with information about the eye pose e.g. in the illustrative car driver application, if the driver's head pose is straight-ahead, the eye pose is processed to determine if the driver is looking straight-forward or downward.
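One way to picture the classification step is the following sketch, which looks up the nearest labelled peak and consults the eye pose only when the head pose is straight-ahead. The dictionary of labelled peaks, the Euclidean distance in bin coordinates, and the `max_distance` cut-off for "nearby" are assumptions of this example.

```python
def classify_gaze(bin_index, labelled_peaks, max_distance, eye_pose=None):
    """Classify gaze from the matched histogram bin.

    labelled_peaks -- dict mapping (row, col) of a peak to its label
    max_distance   -- largest bin distance still counted as "nearby"
    eye_pose       -- "straight-forward", "downward", or None if not computed
    """
    r, c = bin_index
    best_label, best_d2 = None, None
    for (pr, pc), label in labelled_peaks.items():
        d2 = (r - pr) ** 2 + (c - pc) ** 2
        if best_d2 is None or d2 < best_d2:
            best_label, best_d2 = label, d2
    if best_label is None or best_d2 > max_distance ** 2:
        return "unclassified"
    # Head pose alone cannot separate straight-ahead from dashboard.
    if best_label == "straight-ahead" and eye_pose == "downward":
        return "dashboard"
    return best_label


# Hypothetical labelled peaks in the pose-space histogram.
peaks = {
    (5, 8): "straight-ahead",
    (5, 2): "side mirror",
    (3, 13): "rear-view mirror",
}
```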
To enhance the basic scheme to distinguish between roving motions of the gaze and fixations of the gaze, the duration that the individual holds a particular head pose is taken into account before updating the pose-space histogram.
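A duration test of this kind could be sketched as a small filter placed in front of the histogram update. The class name, the use of a frame count as the measure of duration, and the choice to count each fixation once (at the moment it reaches the required length) are all assumptions made for illustration.

```python
class DwellFilter:
    """Pass a bin through only once it has been matched on min_frames
    consecutive images, so brief roving glances are not recorded."""

    def __init__(self, min_frames):
        self.min_frames = min_frames
        self.current = None
        self.run = 0

    def update(self, bin_index):
        """Feed the bin matched for the newest image.

        Returns bin_index on the frame where the fixation reaches
        min_frames, otherwise None."""
        if bin_index == self.current:
            self.run += 1
        else:
            self.current = bin_index
            self.run = 1
        return bin_index if self.run == self.min_frames else None


# Hypothetical sequence: a fixation, a short glance, then another fixation.
f = DwellFilter(min_frames=3)
sequence = [(5, 8)] * 4 + [(5, 2)] * 2 + [(5, 8)] * 3
recorded = [b for b in (f.update(s) for s in sequence) if b is not None]
```

Only the bins in `recorded` would be used to increment the pose-space histogram.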
Thus, it is possible without accurate a priori knowledge about camera calibration or accurate measurements of the environment, and without metric measurements of the head and eyes, to classify gaze direction in real-time. In the automotive alarm application, this permits the generation of appropriate alarms or cues. While the subject system is described in terms of an automotive alarm application, other applications such as time-and-motion studies, observing hospital patients, and determining activity in front of a computer monitor are within the scope of this invention. The algorithms used are appropriate for the Artificial Retina (AR) camera from Mitsubishi Electric Corporation, as described in "Letters to Nature", Kyuma et al, Nature, Vol 372, p. 197, 1994.
In summary, a system is provided to classify the gaze direction of an individual observing a number of surrounding objects. The system utilizes a qualitative approach in which frequently occurring head poses of the individual are automatically identified and labelled according to their association with the surrounding objects. In conjunction with processing of eye pose, this enables the classification of gaze direction.
In one embodiment, each observed head pose of the individual is automatically associated with a bin in a "pose-space histogram". This histogram records the frequency of different head poses over an extended period of time. Given observations of a car driver, for example, the pose-space histogram develops peaks over time corresponding to the frequently viewed directions of toward the dashboard, toward the mirrors, toward the side window, and straight-ahead. Each peak is labelled using a qualitative description of the environment around the individual, such as the approximate relative directions of dashboard, mirrors, side window, and straight-ahead in the car example. The labelled histogram is then used to classify the head pose of the individual in all subsequent images. This head pose processing is augmented with eye pose processing, enabling the system to rapidly classify gaze direction without accurate a priori information about the calibration of the camera utilized to view the individual, without accurate a priori 3D measurements of the geometry of the environment around the individual, and without any need to compute accurate 3D metric measurements of the individual's location, head pose or eye direction at run-time.
These and other features of the subject invention will be better understood with respect to the Detailed Description taken in conjunction with the Drawings, of which:
FIG. 1A is a diagrammatic representation of the initialization stage for the subject system in which head templates and pose-space histograms are generated, with each head template having associated "shape signature" information, and with the shape signature consisting of (a) a region of the template which has been segmented, or identified, as skin, (b) a 1D projection along the horizontal direction, which gives the number of segmented skin pixels along each row of the template, and (c) a set of moments for the segmented skin region;
FIG. 1B is a flow chart illustrating how head templates are made;
FIG. 2A is a diagrammatic representation of the run-time processing for matching an acquired image to a template, and incrementing the corresponding element in the pose-space histogram;
FIG. 2B is a flow chart illustrating the steps for identifying the appropriate head template given an input image and updating the corresponding element in the pose-space histogram of FIG. 2A;
FIG. 3A is a diagrammatic representation of the gaze classification, which takes place after the system has been running for a short duration, in which peaks in the pose-space histogram are labelled according to their association with objects of interest in the surrounding environment, with any subsequent image thereafter being classified by matching to a template, finding the corresponding element in the pose-space histogram and checking for a nearby labelled peak;
FIG. 3B is a flow chart illustrating the checking of the location in the pose-space histogram for the occurrence of the same head template over a short period of time to classify gaze direction;
FIG. 3C is a series of illustrations showing eye segmentation, or identification, and generation of an equivalent rectangle which is used to identify the eye direction, straight-forward or down;
FIGS. 4A-E are a series of images describing initialization, template generation, image matching with a template, eye pose computation, and the recording of the head pose in a pose space histogram; in which an image of the individual is aligned with the generic head model; the templates are generated; an image of the individual is processed to determine the template most similar in appearance; the eye pose is processed; and the element in the pose-space histogram corresponding to the matched template is incremented, leading over time to the development of peaks in the pose-space histogram which indicate the most frequently adopted head poses of the individual;
FIG. 5 is a series of acquired images of an individual, plus a small number of example templates from the full set of templates generated from the acquired images, with the templates showing different synthetically generated rotations of the head;
FIG. 6 is a typical acquired image together with the error surface generated by matching the image against each template, with darker areas indicating lower residuals and thus better matching, and with the error surface being well-behaved, with a clear minimum at the expected location;
FIG. 7 is a series of image sequences for different subjects with the computed head motion, based on the matched template, shown by the 3D model beneath each image, also showing resilience to strong illumination gradients on the face, specularities on glasses, and changing facial expression;
FIG. 8 is a series of images and corresponding pose-space histogram for three head poses adopted repeatedly over an extended sequence, with the histogram showing three distinct peaks as lightened areas;
FIG. 9 is a series of images showing three samples from a driving sequence, with the corresponding pose-space histogram showing a peak as a lightened area for the driver looking straight-forward, and side lobes as slightly darker areas corresponding to the individual viewing the side and rear-view mirrors;
FIG. 10 is a series of images showing how directing the gaze downward results in dropping of the eyelid which obscures a clear view of the iris and pupil, with the measurement of the dropping of the eyelid used to classify whether a driver is looking straight-forward or at the dashboard;
FIGS. 11A-E are images showing processing in the region of the eye, segmenting out non-skin areas, retaining the lower area, and replacing that area with an equivalent rectangle; and
FIGS. 12A-B are a series of images showing how dropping of the eyelid narrows the segmented area so that, by means of the equivalent rectangle as illustrated in FIG. 11, it is possible to classify eye direction as straight-forward or downward.
Referring now to FIG. 1A, an individual 10 is shown seated in front of a windshield 12 of a car 14 having a mirror 16 at the center of the windshield and a side mirror 18 as illustrated. Also illustrated is an instrument cluster 20 with the individual either staring straight-ahead as indicated by dotted line 22, toward mirror 16 as illustrated by dotted line 24, towards side mirror 18 as illustrated by dotted line 26, or toward the instrument cluster as illustrated by dotted line 28.
A camera 30 is trained on the individual 10 and supplies images to a CPU 32 which is coupled to a computer database containing a digital generic head model 34. The processing by CPU 32 provides a number of templates, here illustrated at 31, each a synthetically generated image showing the appearance of the individual's head for a specific head pose. The templates are generated through the operation of the generic head model in concert with the texture obtained from the images of the individual. In one embodiment, a number of shape signatures, such as segmented skin region together with 1D projections and moments of the region, are used to characterize the skin region of the template to permit rapid matching. A pose-space histogram is initialized with one element corresponding to each template, and all elements initialized to zero.
Referring now to FIG. 1B, the steps utilized to generate the templates are illustrated. Here as a first step, camera 30 observes the individual as the individual adopts fronto-parallel and sideways-facing views relative to the camera. The term "fronto-parallel" is used herein to mean that the face is directed straight into the camera. Facial texture in terms of visual appearance is extracted in a conventional manner. Thereafter, as illustrated in 54, the facial texture is used along with the generic head model to generate the templates for a wide range of possible orientations of the head.
Referring now to FIG. 2A, camera 30 is utilized to capture an image of the individual, with CPU 32 determining which of the templates 34 is most similar in appearance to that of the face of individual 10 as recorded by camera 30.
Referring now to FIG. 2B, a series of steps is performed when matching an image to its most similar template. Here, as illustrated at 70, one takes the image at camera 30 and as illustrated at 72 identifies the skin area. The reason that this is done is to be able to detect the form of the face which is easily recognizable, without having to consider non-skin areas such as the eyeball, teeth, and hair. Thereafter, as illustrated at 74, a signature is generated for this skin area. The signature in one embodiment is a compact representation of the shape of the skin area, which makes possible rapid matching of the image from camera 30 to the templates.
As illustrated at 76, templates with similar signatures are found in a matching process in which the shape signature of the image is compared with the shape signature of each template, and similar signatures are identified. As illustrated at 78, for these similar templates, a cross-correlation of image color gradients is performed between the image and each template to find the most similar template. Having ascertained the template which most closely corresponds, the corresponding bin in the pose-space histogram 60 is incremented as illustrated at 82.
Referring now to FIG. 3A, after the system has been running for a short duration to allow the development of peaks corresponding to frequent gaze directions in the pose-space histogram, these peaks are automatically detected and labelled according to their association with objects of interest in the surrounding environment. In the illustrative car driver application, the peaks correspond to viewing the dashboard, the mirrors, or straight-ahead.
Referring now to FIG. 3B, gaze classification takes place by processing an acquired image of the individual in the same way as in FIG. 2B, but as a final step, and as illustrated at 106, if the same head template is matched for a short duration, the subject system checks the corresponding location in the pose-space histogram, and the viewing direction is classified according to the closest labelled peak in the histogram. The result is a determination that the individual is looking in a direction corresponding to a direction in which he frequently gazes. Thus, without actual 3D metric measurements, such as distance and angle, of head position or eye position, one can ascertain the gaze direction without having to know anything about either the individual or his environment.
Referring now to FIG. 3C, some head poses are not sufficient on their own to classify the gaze direction. In this case, extra processing is carried out on the eye direction. The segmented eye 90 in the acquired image is examined and fitted with an equivalent rectangle 92 which gives a measure of whether the eye is directed straight-forward or downward. In the illustrative car driver application, if the head pose is straight-ahead, the eye pose is examined. If the eye pose is also straight-ahead, the gaze is classified as straight-ahead. If the eye pose is downward, the gaze is classified as toward the dashboard.
In one embodiment of the subject invention, the characterization of a face is accomplished using an ellipsoid such as described by Basu, Essa, and Pentland in a paper entitled "Motion Regularization for Model-Based Head Tracking", 13th Int Conf on Pattern Recognition, Vienna 1996. In another embodiment, the subject invention characterizes the face using the aforementioned generic head model as described in "Human Face Recognition: From Views to Models--From Models to Views", Bichsel, 2nd Intl Conf on Face and Gesture Recognition, 1996.
More particularly, as to the processing of head pose, initialization involves the creation of a 3D coordinate frame containing the camera and a 3D model of the head, consistent with the physical setup. FIG. 4A shows a reference image of the subject at left, which is cropped to the projection of the 3D model at right.
As to generating a template, once the coordinate frame containing the camera and the 3D model has been initialized, it is possible to generate a synthetic view of the face consistent with any specified rotation of the head. This is effectively done by backprojecting image texture from the reference image, as in FIG. 4B at left, onto the 3D model, and reprojecting using a camera at a different location, as in FIG. 4B at right. In practice, the reprojection takes place directly between the images. As can be seen in FIG. 4C, images of a subject are matched with the most similar template.
Here an image is matched to the most similar template, namely that image shown to the right. The template is one which is formed as illustrated in FIG. 4B.
In order to further define the gaze direction, it is important to classify eye pose. As shown in FIG. 4D, the eye of the subject is segmented, and an "equivalent rectangle" is generated. This rectangle is useful in specifying whether the gaze direction is straight-ahead or downwards towards, for instance, an automobile dashboard.
As can be seen in FIG. 4E, the system records head pose in a pose-space histogram, recorded for an automobile driver. A bright spot to the left of the figure indicates the driver looked left. If the bright spot is not only left but is below the horizontal center line, one can deduce that the driver is looking at a lower side mirror. If the bright spot is in the center, then it can be deduced that the driver is looking straight-ahead. If the bright spot is upwards and to the right, one can deduce that the driver is looking upwardly towards the rear-view mirror. In this manner, the pose-space histogram provides a probabilistic indication of the gaze direction of the driver without having to physically measure head position or eye direction.
Referring now to FIG. 5, three images of a subject are used to generate a set of typically several hundred templates showing the subject from a variety of viewpoints. Some example templates are shown in FIG. 5 illustrating the subject looking right, towards the center, and left, both upwardly, straight-ahead, and downward. Two types of 3D model have been investigated--an ellipsoid as described in "Motion regularization for model-based head tracking" by S. Basu et al, 13th Int'l Conference on Pattern Recognition, Vienna, 1996, and a generic head model as described in "Head Pose Determination from One Image Using a Generic Model" by I. Shimizu et al, 3rd Intl Conf on Face and Gesture Recognition, 1998. A generic head model was used to generate the views in FIG. 5. The advantage of the ellipsoid model is that it allows quick initialization of many templates, of the order of seconds for 200 templates of 32×32 resolution, and minor misalignments of the ellipsoid with the reference image have little effect on the final result. The generic head model requires more careful alignment but it clearly gives more realistic results and this improves the quality of the processing which will be described subsequently. Some artifacts, visible in FIG. 5, occur because the generic head model is only an approximation to the actual shape of the subject's face.
Template generation is done offline at initialization time. It is carried out for a range of rotations, around the horizontal axis through the ears, and the vertical axis through the head, to generate an array of templates. Typically we use ±35° and ±60° around the horizontal and vertical axes respectively and generate an 11×17 array. The example in FIG. 5 shows a small selection of images taken from the full array. Cyclorotations of the head are currently ignored because these are relatively uncommon motions, and there is in any case some resilience in the processing to cyclorotation.
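For concreteness, the pose angles behind such an array can be enumerated as in the sketch below. Uniform spacing of the angles is an assumption (the text gives only the ranges and the array size); under that assumption the steps work out to 7° about the horizontal axis and 7.5° about the vertical axis. The function and parameter names are hypothetical.

```python
def pose_grid(max_tilt=35.0, max_pan=60.0, rows=11, cols=17):
    """Return a rows x cols grid of (tilt, pan) angles in degrees.

    tilt -- rotation about the horizontal axis through the ears
    pan  -- rotation about the vertical axis through the head
    Angles are spaced uniformly (an assumption of this sketch)."""
    tilts = [-max_tilt + i * 2 * max_tilt / (rows - 1) for i in range(rows)]
    pans = [-max_pan + j * 2 * max_pan / (cols - 1) for j in range(cols)]
    return [[(t, p) for p in pans] for t in tilts]


grid = pose_grid()
# grid[r][c] is the pose used to render the template for histogram bin (r, c).
```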
FIG. 6 shows a typical target image together with the error surface generated by matching the target against each image in the array of templates. The error surface is often well-behaved, as shown here. The horizontal elongation of the minimum occurs because the dominant features in the matching process are the upper hairline, the eyes, and the mouth, all of which are horizontally aligned features, so that horizontal offsets have a smaller effect on the matching score in equation (1) than vertical offsets.
FIG. 7 shows tracking for a number of different subjects. For each image, the best-matching template has been found, and a 3D head model is illustrated with pose given by the pose angles which were used to generate that template.
As to processing eye pose, the work has been targeted at one specific task, which is discriminating whether a car driver is looking straight forward or at the dashboard, since head pose alone is insufficient for this discrimination in most subjects. The approach is again qualitative, avoiding explicit computation of 3D Euclidean rays for eye direction.
FIG. 8 shows the result of an experiment in which the subject repeatedly views three different locations over an extended period, with a short pause at each location. The three locations correspond to the rear-view mirror, the side-mirror, and straight-ahead for a car driver. The pose-space histogram shows distinctive peaks for each location.
FIG. 9 shows the pose-space histogram for a short video sequence of a driver in a car. There is a peak for the straight-ahead viewing direction, and lobes to the left and right correspond to the driver looking at the side and rear-view mirrors.
While active systems which use reflected infra-red are able to identify the location of the pupil very reliably, this is a more difficult measurement in a passive system, particularly when the gaze direction is directed downward. FIG. 10 shows how directing the gaze downward results in dropping of the eyelid, which obscures a clear view of the iris and pupil. The approach below uses the dropping of the eyelid to classify whether a driver is looking straight-forward or at the dashboard.
In FIG. 11, the current head pose is known at this stage, obtained via the processing in FIG. 2. Thus the approximate location of the eye is also known, and an algorithm to segment the eye is targeted to the appropriate part of the image, as shown in FIG. 11A. The segmentation in FIGS. 11B and 11C is achieved using the Color Predicate scheme described in "Finding skin in color images" by R. Kjeldsen et al, 2nd Intl Conf on Automatic Face and Gesture Recognition, 1996. In this approach, training examples of skin and non-skin colors are used to label each element in a quantized color space. Kjeldsen found that the same Color Predicate could be used to segment skin in many human subjects. In this work so far, a new Color Predicate is generated for each subject. In the first stage of segmentation, each pixel in the target area is labelled as skin or non-skin, regions of connected non-skin pixels are generated, and tiny non-skin regions, if any, are discarded. Typically two large non-skin regions are detected, for the eye and the eyebrow, as shown in FIG. 11B. The eye is selected as the region which is physically lowest in the target area, FIG. 11C.
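The region-selection stage described above can be sketched as follows. The sketch assumes a boolean skin mask has already been produced for the target area (the Color Predicate itself is not implemented here), that image row 0 is at the top, and that 4-connectivity is used; the names `non_skin_regions`, `select_eye` and the `min_pixels` value are hypothetical.

```python
from collections import deque
import numpy as np

def non_skin_regions(skin_mask, min_pixels=4):
    """Group non-skin pixels into 4-connected regions, discarding tiny ones."""
    h, w = skin_mask.shape
    seen = np.zeros((h, w), dtype=bool)
    regions = []
    for r0 in range(h):
        for c0 in range(w):
            if skin_mask[r0, c0] or seen[r0, c0]:
                continue
            # Breadth-first flood fill from an unvisited non-skin pixel.
            queue, pixels = deque([(r0, c0)]), []
            seen[r0, c0] = True
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= rr < h and 0 <= cc < w
                            and not skin_mask[rr, cc] and not seen[rr, cc]):
                        seen[rr, cc] = True
                        queue.append((rr, cc))
            if len(pixels) >= min_pixels:
                regions.append(pixels)
    return regions

def select_eye(regions):
    """Pick the region lying lowest in the image (largest mean row index)."""
    return max(regions, key=lambda px: np.mean([r for r, _ in px]))
```

With an eyebrow region above an eye region, `select_eye` returns the eye; isolated non-skin pixels are removed by the size filter before the selection is made.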
The warping in FIG. 11D is intended to generate the appearance of the eye for a face which is fronto-parallel to the camera, thus factoring out perspective effects. In the general case, this warping is derived from two pieces of information, the rotation of the head which makes the face fronto-parallel to the camera, known from the estimated head pose, and the 3D shape around the eye. To avoid the latter requirement, the area around the eye is assumed locally planar with normal equal to the forward direction of the face. The warping can then be expressed as a planar projectivity. This is straightforward to derive from the required head motion.
The equivalent rectangle of the segmented shape is shown in FIG. 11E. This representation was used in "Computer Vision for Interactive Computer Games" by Freeman et al, IEEE Computer Graphics and Applications, Vol 18, No 3, 1998, to analyze hand gestures. The segmented image is treated as a binary image, and the segmented shape is replaced with a rectangle which has the same moments up to second order. The ratio of height to width of the equivalent rectangle gives a measure of how much the eyelid has dropped. A fixed threshold is applied to this ratio to classify a driver's eye direction as forward or toward the dashboard. FIG. 12 shows an example of the narrowing of the segmented area as the eyelid drops.
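A minimal sketch of the equivalent-rectangle measure follows. It matches the centroid and the second-order central moments of the binary shape, using the fact that a uniform rectangle of side s has variance s²/12 along that side. Taking the ratio as short side over long side, and the threshold value of 0.4, are assumptions made for illustration; the text does not state the threshold used.

```python
import numpy as np

def equivalent_rectangle(mask):
    """Return (centroid, long_side, short_side) of the rectangle whose
    second-order central moments match those of the binary shape."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask contains no segmented pixels")
    centroid = (cols.mean(), rows.mean())
    # 2x2 covariance of pixel coordinates (second central moments / area).
    cov = np.cov(np.vstack([cols, rows]), bias=True)
    evals = np.linalg.eigvalsh(cov)  # ascending order
    short_side = np.sqrt(12.0 * max(evals[0], 0.0))
    long_side = np.sqrt(12.0 * max(evals[1], 0.0))
    return centroid, long_side, short_side

def eyelid_ratio(mask):
    """Height-to-width ratio of the equivalent rectangle (short / long)."""
    _, long_side, short_side = equivalent_rectangle(mask)
    return 0.0 if long_side == 0.0 else short_side / long_side

def is_looking_down(mask, threshold=0.4):
    # threshold is a hypothetical value; a lower ratio means a narrower eye.
    return eyelid_ratio(mask) < threshold
```

A segmented eye that narrows as the eyelid drops yields a smaller ratio, so it falls below the fixed threshold and is classified as directed toward the dashboard.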
Of course, the dropping of the eyelid occurs during blinking as well as for downward gaze direction. The two cases can be differentiated by utilizing the duration of the eye state, since blinking is transitory but attentive viewing has a longer timespan.
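The duration test can be sketched as a run-length check over per-frame eye states. The text says only that blinking is transitory, so the `min_frames` cutoff below is a hypothetical value and the function name is illustrative.

```python
def classify_eye_states(lowered, min_frames=5):
    """Label each frame as 'forward', 'blink' or 'dashboard'.

    lowered: sequence of booleans, True where the eyelid is detected as dropped.
    min_frames: hypothetical minimum run length treated as attentive viewing.
    """
    labels = ["forward"] * len(lowered)
    i = 0
    while i < len(lowered):
        if not lowered[i]:
            i += 1
            continue
        # Find the end of this run of consecutive lowered-eyelid frames.
        j = i
        while j < len(lowered) and lowered[j]:
            j += 1
        label = "dashboard" if (j - i) >= min_frames else "blink"
        for k in range(i, j):
            labels[k] = label
        i = j
    return labels
```

Short runs are attributed to blinking, and only runs at least `min_frames` long are treated as a downward gaze.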
As to matching against templates, processing a target image of the driver involves comparing that image with each of the templates to find the best match, see FIG. 4C. A culling process is first carried out based on the shape signature e.g. 1D projection and moments, of the segmented skin area in the target image and templates, to eliminate templates which are clearly a poor match.
For the surviving templates, consider a target image I which is being matched against a template S. The goodness of match M between the two is found by computing
$$M = \sum_{(i,j)} \left[ 1 - \cos\left( I_d(i,j) - S_d(i,j) \right) \right] \qquad (1)$$
where $I_d(i,j)$ and $S_d(i,j)$ are the directions of the gradient of the image intensity at pixel $(i,j)$ in the target image and template respectively, and the summation is over all active pixels in the template. The best-matching template is the one which minimizes this score.
The target image is matched against a template for a range of offsets around the default position. Typically the range of offsets is ±4 pixels in steps of 2 pixels.
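The score of equation (1) and the offset search can be sketched as follows. This is an illustration under assumptions: gradient direction is computed with finite differences and `arctan2`, the active pixels are supplied as a boolean mask, and `top_left` is a hypothetical parameter giving the default position of the template within the target image.

```python
import numpy as np

def gradient_direction(image):
    """Direction (radians) of the intensity gradient at each pixel."""
    gy, gx = np.gradient(image.astype(float))
    return np.arctan2(gy, gx)

def match_score(target_dir, template_dir, active):
    """Equation (1): sum of 1 - cos(direction difference) over active pixels."""
    diff = target_dir[active] - template_dir[active]
    return float(np.sum(1.0 - np.cos(diff)))

def best_offset_score(target, template, active, top_left, max_offset=4, step=2):
    """Try offsets of -max_offset..+max_offset pixels around the default
    position and return (lowest score, (dy, dx))."""
    th, tw = template.shape
    target_dir = gradient_direction(target)
    template_dir = gradient_direction(template)
    best = (np.inf, (0, 0))
    for dy in range(-max_offset, max_offset + 1, step):
        for dx in range(-max_offset, max_offset + 1, step):
            r, c = top_left[0] + dy, top_left[1] + dx
            # Skip offsets that would place the template outside the target.
            if r < 0 or c < 0 or r + th > target.shape[0] or c + tw > target.shape[1]:
                continue
            window = target_dir[r:r + th, c:c + tw]
            score = match_score(window, template_dir, active)
            if score < best[0]:
                best = (score, (dy, dx))
    return best
```

Running `best_offset_score` for every surviving template and keeping the template with the lowest score gives the best match. Pixels with a zero gradient have no meaningful direction, so in practice the active mask would be expected to exclude them.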
As to using multiple reference images, the basic scheme is extended to make use of three reference images of the subject in the following way. The fronto-parallel reference image is used to generate an array of templates. The subject looks to the left, a left-facing reference image is taken, and the best-match template is computed. All entries in the array which correspond to more extreme left-turn rotations than the best-match are now regenerated, using the left-facing reference image. This is repeated on the right side. This provides better quality templates for the more extreme rotations of the head.
As to the pose-space histogram, the algorithm for processing head pose does not deliver accurate measurements of head orientation because the head model is approximate and the computable poses are quantized. However, it does allow identification of frequently adopted head poses, together with the relative orientation of those poses, and that information provides the basis for classifying the driver's view direction.
Corresponding to the 2D array of templates of the head, a 2D histogram of the same dimensions is set up. All elements of the histogram are initialized to zero. For each new target image of the driver, once the best-matching template has been found, the corresponding element in the histogram is incremented. Over an extended period, peaks will appear in the histogram for those head poses which are being most frequently adopted.
Ideally, one would expect to find a peak corresponding to the driver looking straight-ahead, a peak to the left of this for viewing the left-side mirror, and a peak to the right for viewing the rear-view mirror as illustrated in FIG. 4E. Observed peaks can be labelled automatically in accordance with this. Thereafter, for any acquired image of the driver, the best-matching template is found, the corresponding location in the histogram is indexed, and the target image is classified according to its proximity to a labelled peak. In this way, classification of the driver's focus of attention is achieved without any quantitative information about the 3D layout of the car.
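The accumulation, labelling and classification steps above can be sketched as follows. Several choices here are assumptions not stated in the text: peaks are taken as local maxima over the 8-neighbourhood, the highest peak is treated as straight-ahead, column index is taken to increase to the right, and `max_dist` is a hypothetical proximity limit. All function names are illustrative.

```python
import numpy as np

def find_peaks(counts, min_count=1):
    """Cells that are maxima of their 3x3 neighbourhood (plateaus yield several)."""
    padded = np.pad(counts, 1, mode="constant", constant_values=-1)
    peaks = []
    for r in range(counts.shape[0]):
        for c in range(counts.shape[1]):
            v = counts[r, c]
            if v >= min_count and v == padded[r:r + 3, c:c + 3].max():
                peaks.append((r, c))
    return peaks

def label_peaks(counts, peaks):
    """Label peaks by position relative to the highest peak (assumed straight-ahead)."""
    ahead = max(peaks, key=lambda p: counts[p])
    labels = {ahead: "straight-ahead"}
    for p in peaks:
        if p[1] < ahead[1]:
            labels[p] = "left-side mirror"
        elif p[1] > ahead[1]:
            labels[p] = "rear-view mirror"
    return labels

def classify(labels, cell, max_dist=2.0):
    """Classify a histogram cell by proximity to the nearest labelled peak."""
    def dist(p):
        return float(np.hypot(p[0] - cell[0], p[1] - cell[1]))
    nearest = min(labels, key=dist)
    return labels[nearest] if dist(nearest) <= max_dist else "unclassified"

# Histogram with the same dimensions as the 11x17 template array.
counts = np.zeros((11, 17), dtype=int)
for (row, col), n in {(5, 8): 50, (5, 9): 20, (6, 2): 10, (3, 13): 12}.items():
    counts[row, col] += n  # one increment per frame whose best match is (row, col)

labels = label_peaks(counts, find_peaks(counts))
```

For any later frame, the index of the best-matching template is passed to `classify`, which returns the label of the nearest peak or reports the pose as unclassified when no labelled peak is close.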
As to results, some experiments were carried out on 32×32 images captured by the Artificial Retina of the Mitsubishi Electric Company. Others were carried out on 192×192 images captured by a Sony Hi-8 video camera. The processing speed is about 10 Hz for computing head pose with 32×32 images on an SGI workstation.
Since the main idea of the subject system is to avoid explicit measurement of the rotation angles of the head, no quantitative measurements about head pose are given, but various aspects of the performance of the system are illustrated.
Having now described a few embodiments of the invention, and some modifications and variations thereto, it should be apparent to those skilled in the art that the foregoing is merely illustrative and not limiting, having been presented by way of example only. Numerous modifications and other embodiments are within the scope of one of ordinary skill in the art and are contemplated as falling within the scope of the invention as limited only by the appended claims and equivalents thereto.
|U.S. Classification||382/103, 382/170, 340/576, 382/104|
|International Classification||G06T7/20, A61B3/113, A61B5/117, G06K9/00, G06T7/60, G06T7/00|
|Cooperative Classification||G06K9/00845, G06T7/74, G06K9/00604, G06K9/00228, A61B5/1176, A61B3/113, A61B5/7264|
|European Classification||G06T7/00P1E, A61B5/72K12, G06K9/00F1, A61B3/113, A61B5/117F|
|Oct 1, 1998||AS||Assignment|
Owner name: MITSUBISHI ELECTRIC INFORMATION TECHNOLOGY CENTER
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BEARDSLEY, PAUL ANTHONY;REEL/FRAME:009498/0924
Effective date: 19980930
|Jan 23, 2001||AS||Assignment|
Owner name: MITSUBISHI ELECTRIC RESEARCH LABORATORIES, INC., M
Free format text: CHANGE OF NAME;ASSIGNOR:MITSUBISHI ELECTRIC INFORMATION TECHNOLOGY CENTER AMERICA, INC.;REEL/FRAME:011564/0329
Effective date: 20000828
|Jun 16, 2004||REMI||Maintenance fee reminder mailed|
|Nov 18, 2004||FPAY||Fee payment|
Year of fee payment: 4
|Nov 18, 2004||SULP||Surcharge for late payment|
|May 21, 2008||FPAY||Fee payment|
Year of fee payment: 8
|May 29, 2012||FPAY||Fee payment|
Year of fee payment: 12