|Publication number||US7598975 B2|
|Application number||US 10/978,172|
|Publication date||Oct 6, 2009|
|Filing date||Oct 30, 2004|
|Priority date||Jun 21, 2002|
|Also published as||CA2521670A1, CA2521670C, CN1783998A, CN1783998B, EP1659518A2, EP1659518A3, US20050285943|
|Inventors||Ross G. Cutler|
|Original Assignee||Microsoft Corporation|
This application is a continuation-in-part of U.S. patent application Ser. No. 10/177,315, entitled “A System and Method for Distributed Meetings”, filed Jun. 21, 2002, now U.S. Pat. No. 7,259,784, by the present inventor and assigned to Microsoft Corp., the assignee of the present application. Applicant claims priority to the filing date of said application, which is hereby incorporated by reference for all that it discloses and teaches.
The following description relates generally to video image processing. More particularly, the following description relates to providing an indexed timeline for video playback.
Playback of recorded video of scenarios that include more than one speaker—such as playback of a recorded meeting—is usually shown contemporaneously with an indexed timeline. Using the timeline, a user can quickly move to a particular time in the meeting by manipulating one or more timeline controls. When the video includes more than one speaker, multiple timelines may be used where one timeline is associated with a particular speaker. Each timeline indicates when a corresponding speaker is speaking. That way, a user can navigate to portions of the meeting where a particular speaker is speaking.
Such multiple timelines may be labeled in a generic fashion to identify each speaker as, for example, “Speaker 1,” “Speaker 2,” etc. Current techniques for automatically labeling timelines with specific speaker names are inaccurate and also may require a database of users and their associated voiceprints and faceprints, which could entail security and privacy issues.
The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
The following description relates to various implementations and embodiments for automatically detecting each speaker's face in a multi-speaker environment and associating one or more images of a speaker's face with a portion of a timeline that corresponds to the speaker. This sort of specific labeling has advantages over generic labeling in that a viewer can more readily determine which portion of a timeline corresponds to a particular one of multiple speakers.
In the following discussion, a panoramic camera is described that is used to record a meeting having more than one participant and/or speaker. Although a panoramic camera including multiple cameras is described, the following description also relates to single cameras and to multi-camera devices having two or more cameras.
A panoramic image is input to a face tracker (FT) which detects and tracks faces in the meeting. A microphone array is input to a sound source localizer (SSL) which detects locations of speakers based on sound. The outputs from the face tracker and from the sound source localizer are input to a virtual cinematographer to detect locations of the speakers.
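As a concrete illustration of this stage of the pipeline, the sketch below shows one way per-frame face tracker and sound source localizer outputs could be fused into speaker detections. It is a minimal sketch only; the patent does not prescribe an implementation, and the names, types, and angular threshold here are illustrative assumptions.

```python
# Minimal sketch (not the patent's implementation) of fusing face tracker
# and sound source localizer outputs; names and thresholds are assumptions.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Face:
    azimuth: float                    # direction of the tracked face, degrees
    bbox: Tuple[int, int, int, int]   # (x, y, width, height) in the panorama

@dataclass
class SpeakerEvent:
    time: float
    azimuth: float
    bbox: Tuple[int, int, int, int]

def fuse(time: float, faces: List[Face],
         ssl_azimuth: Optional[float],
         max_angle: float = 10.0) -> Optional[SpeakerEvent]:
    """Attribute a localized sound to the nearest tracked face, if any."""
    if ssl_azimuth is None or not faces:
        return None
    best = min(faces, key=lambda f: abs(f.azimuth - ssl_azimuth))
    if abs(best.azimuth - ssl_azimuth) <= max_angle:
        return SpeakerEvent(time, best.azimuth, best.bbox)
    return None                       # sound did not match any tracked face
```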
The speakers are post-processed with a speaker clustering module which clusters speakers temporally and spatially to better delineate an aggregate timeline that includes two or more individual timelines. The (aggregate) timeline is stored in a timeline database. A faces database is created to store one or more images for each speaker, at least one image of each face being used in the timeline associated with the corresponding speaker.
The concepts presented and claimed herein are described in greater detail, below, with regard to one or more appropriate operating environments. Some of the elements described below are also described in parent U.S. patent application Ser. No. 10/177,315, entitled “A System and Method for Distributed Meetings”, filed Jun. 21, 2002 and incorporated by reference above.
Exemplary Operating Environment
The described techniques and objects are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The following description may be couched in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The described implementations may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
With reference to the accompanying figure, an exemplary system for implementing the described techniques includes a general purpose computing device in the form of a computer 110. Components of computer 110 may include, but are not limited to, a processing unit 120, a system memory 130, and a system bus 121 that couples various system components including the system memory to the processing unit 120.
Computer 110 typically includes a variety of computer readable media. Computer readable media can be any available media that can be accessed by computer 110 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 110. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
The system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within computer 110, such as during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120.
The computer 110 may also include other removable/non-removable, volatile/nonvolatile computer storage media, such as hard disk drives, magnetic disk drives, and optical disk drives.
The drives and their associated computer storage media discussed above provide storage of computer readable instructions, data structures, program modules and other data for the computer 110.
The computer 110 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 180. The remote computer 180 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 110, although only a memory storage device 181 is illustrated. The logical connections include a local area network (LAN) 171 and a wide area network (WAN) 173, but may also include other networks.
When used in a LAN networking environment, the computer 110 is connected to the LAN 171 through a network interface or adapter 170. When used in a WAN networking environment, the computer 110 typically includes a modem 172 or other means for establishing communications over the WAN 173, such as the Internet. The modem 172, which may be internal or external, may be connected to the system bus 121 via the user input interface 160, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 110, or portions thereof, may be stored in the remote memory storage device.
Exemplary Panoramic Camera and Client Device
The panoramic camera apparatus 200 includes a processor 202 and memory 204. The panoramic camera apparatus 200 creates a panoramic image by stitching together several individual images produced by multiple cameras 206 (designated 206_1 through 206_n). The panoramic image may be a complete 360° panoramic image or it may be only a portion thereof. It is noted that although a panoramic camera apparatus 200 is shown and described herein, the described techniques may also be utilized with a single camera.
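For illustration only, a panorama could be produced from the individual camera frames with OpenCV's high-level stitcher, as sketched below. The apparatus described here uses precomputed stitching tables stored in memory 204, so this is a stand-in, not the device's method.

```python
# Illustrative stand-in for panoramic stitching using OpenCV
# (pip install opencv-python); the described apparatus instead uses
# precomputed stitching tables.
import cv2

def stitch(frames):
    """frames: list of BGR images, one per camera, with overlapping views."""
    stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
    status, panorama = stitcher.stitch(frames)
    if status != cv2.Stitcher_OK:
        raise RuntimeError(f"stitching failed with status {status}")
    return panorama
```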
The panoramic camera apparatus 200 also includes a microphone array 208. As will be described in greater detail below, the microphone array is configured so that sound direction may be localized. In other words, analysis of sound input into the microphone array yields a direction from which a detected sound is produced. A speaker 210 may also be included in the panoramic camera apparatus 200 to enable a speakerphone or to emit notification signals and the like to users.
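One conventional localization building block, sketched below, estimates the time difference of arrival of a sound at a microphone pair using GCC-PHAT and maps it to a direction. The patent does not mandate any particular method; the pairing and constants here are assumptions.

```python
# Sketch of one conventional sound source localization building block:
# GCC-PHAT time-delay estimation between a microphone pair.
import numpy as np

def gcc_phat_delay(sig_a: np.ndarray, sig_b: np.ndarray, fs: int) -> float:
    """Estimate arrival-time difference (seconds) of a source at two mics."""
    n = len(sig_a) + len(sig_b)
    A = np.fft.rfft(sig_a, n=n)
    B = np.fft.rfft(sig_b, n=n)
    cross = A * np.conj(B)
    cross /= np.abs(cross) + 1e-12        # PHAT weighting (keep phase only)
    corr = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    corr = np.concatenate((corr[-max_shift:], corr[:max_shift + 1]))
    return (np.argmax(np.abs(corr)) - max_shift) / fs

# With mic spacing d (meters) and speed of sound c of about 343 m/s, the
# delay maps to a direction: azimuth = arcsin(clip(c * delay / d, -1, 1)).
```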
The memory 204 stores several camera settings 212 such as calibration data, exposure settings, stitching tables, etc. An operating system 214 that controls camera functions is also stored in the memory 204 along with one or more other camera software applications 216.
The panoramic camera apparatus 200 also includes an input/output (I/O) module 218 for transmitting data from, and receiving data at, the panoramic camera apparatus 200, and miscellaneous other hardware 220 that may be required for camera functionality.
The panoramic camera apparatus 200 communicates with at least one client device 222, which includes a processor 224, memory 226, a mass storage device 242 (such as a hard disk drive) and other hardware 228 that may be required to execute the functionality attributed to the client device 222 below.
The memory 226 stores a face tracker (FT) module 230 and a sound source localization (SSL) module 232. The face tracker module 230 and the sound source localization module 232 are used in conjunction with a virtual cinematographer 234 to detect a person in a camera scene and determine if and when the person is speaking. Any of several conventional methods of sound source localization may be used. Various face tracker methods (or person detection and tracking systems), including the one described in the parent application hereto, may be used as described herein.
The memory 226 also stores a speaker clustering module 236 that is configured to determine a primary speaker when two or more persons are speaking and to attribute a particular timeline portion to the primary speaker. In most meeting situations, there are instances where more than one person talks at the same time. Usually, a primary speaker is speaking while another person briefly interrupts or talks over the speaker. The speaker clustering module 236 is configured to cluster speakers temporally and spatially to clean up the timeline.
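A minimal sketch of such temporal and spatial clustering follows, assuming speaker detections reduced to (time, azimuth) pairs; the gap and angle thresholds are illustrative assumptions, not values from the patent.

```python
# Minimal temporal/spatial clustering sketch: detections close in time and
# direction are merged into one segment, suppressing brief interruptions.
def cluster(events, max_gap=2.0, max_angle=10.0):
    """events: iterable of (time_seconds, azimuth_degrees) pairs."""
    segments = []                        # {"azimuth", "start", "end"} dicts
    for t, az in sorted(events):
        last = segments[-1] if segments else None
        if (last and abs(last["azimuth"] - az) <= max_angle
                and t - last["end"] <= max_gap):
            last["end"] = t              # extend the current segment
        else:
            segments.append({"azimuth": az, "start": t, "end": t})
    return segments
```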
A timeline 238 is created by the virtual cinematographer 234. The timeline 238 is stored in a timeline database 244 on the mass storage device 242. The timeline database 244 includes a plurality of fields including, but not necessarily limited to, time, speaker number, and speaker bounding box within a camera image (x, y, width, height). The timeline database 244 may also include one or more speaker face angles (azimuth and elevation).
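As an illustration of the fields just listed, a timeline record might be laid out as below; the choice of SQLite and the column types are assumptions made for the sketch, not details from the patent.

```python
# Illustrative schema for the timeline database fields described above;
# SQLite and the column types are assumptions.
import sqlite3

conn = sqlite3.connect("timeline.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS timeline (
        time        REAL,     -- seconds from start of the recording
        speaker     INTEGER,  -- speaker number (one per detected location)
        bbox_x      INTEGER,  -- speaker bounding box in the camera image
        bbox_y      INTEGER,
        bbox_w      INTEGER,
        bbox_h      INTEGER,
        azimuth     REAL,     -- optional speaker face angles
        elevation   REAL
    )""")
conn.commit()
```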
A face extractor module 240 is also stored in the memory 226 and is configured to extract an image of a speaker's face from a face bounding box (identified by the face tracker 230) of a camera image. The face extractor module 240 stores extracted facial images in a face database 246 on the mass storage device 242.
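Extraction itself can be as simple as cropping the tracker's bounding box out of the panoramic frame, as in the hedged sketch below; the margin parameter is an illustrative assumption.

```python
# Sketch of face extraction: crop the face tracker's bounding box (plus a
# small margin, an assumption here) out of the panoramic frame.
def extract_face(frame, bbox, margin=0.2):
    """frame: HxWx3 image array; bbox: (x, y, width, height) from tracker."""
    x, y, w, h = bbox
    mx, my = int(w * margin), int(h * margin)
    top, left = max(0, y - my), max(0, x - mx)
    bottom = min(frame.shape[0], y + h + my)
    right = min(frame.shape[1], x + w + mx)
    return frame[top:bottom, left:right].copy()
```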
In at least one implementation, multiple facial images may be stored for one or more speakers. Parameters can be specified to determine which facial image is used at which particular times. Or, a user may be able to manually select a particular facial image from the multiple facial images.
In at least one alternative implementation, only a single facial image is stored for each speaker. The stored facial image may be a single image extracted by the face extractor module 240, but the face extractor module 240 may also be configured to select a best image of a speaker.
Selecting a best image of a speaker can be accomplished by identifying frontal facial angles (on the assumption that a frontal view of a face is a better representation than an alternative view), by identifying a facial image that exhibits a minimum of motion, or by identifying a facial image that maximizes facial symmetry.
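Of the criteria just listed, the symmetry test is easy to sketch: compare a grayscale face crop with its mirror image, on the assumption that a more frontal face is more left-right symmetric.

```python
# Sketch of the facial-symmetry criterion: a face crop is scored by how
# closely it matches its own mirror image (higher is more symmetric).
import numpy as np

def symmetry_score(face_gray: np.ndarray) -> float:
    """face_gray: HxW grayscale face crop, float values."""
    mirrored = face_gray[:, ::-1]
    return -float(np.abs(face_gray - mirrored).mean())
```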
The recorded meeting 248 is also stored on the mass storage device 242 so that it can be recalled and played back at a later time.
The elements and functionality shown and described above with regard to the panoramic camera apparatus 200 and the client device 222 are described in greater detail below, with respect to the exemplary playback screens and methodological implementations that follow.
Exemplary Playback Screen
The exemplary playback screen 300 also includes a controls section 310 that contains controls typically found in a media player, such as a play button, a fast forward button, a rewind button, etc. An information area 312 is included in the playback screen 300 where information regarding the subject matter of the playback screen 300 may be displayed. For example, a meeting title, a meeting room number, a list of meeting attendees, and the like may be displayed in the information area 312.
The facial image timeline 304 includes a first sub-timeline 314 that corresponds to the first meeting participant 303 and a second sub-timeline 316 that corresponds to the second meeting participant 305. Each sub-timeline 314, 316 indicates sections along a temporal continuum where the corresponding meeting participant is speaking. A user may directly access any point on a sub-timeline 314, 316 to immediately access a portion of the meeting wherein a particular meeting participant is speaking.
A first facial image 318 of the first meeting participant 303 appears adjacent to the first sub-timeline 314 to indicate that the first sub-timeline 314 is associated with the first meeting participant 303. A second facial image 320 of the second meeting participant 305 appears adjacent to the second sub-timeline 316 to indicate that the second sub-timeline 316 is associated with the second meeting participant 305.
The exemplary playback screen 400 includes a panoramic image 302 and a facial image timeline 304. The panoramic image 302 shows a first meeting participant 303 and a second meeting participant 305. A title bar 306 spans the top of the playback screen 400 and an individual image 408 shows the second meeting participant 305.
The exemplary playback screen 400 also includes a whiteboard speaker image 402 that displays a meeting participant (in this case, the second meeting participant 305) that is situated before a whiteboard. The whiteboard speaker image 402 is not included in the playback screen 300 described previously.
A controls section 310 includes multimedia controls and an information area 312 displays information regarding the meeting shown on the playback screen 400.
The facial image timeline 304 includes a first sub-timeline 314, a second sub-timeline 316 and a third sub-timeline 404. It is noted that while only two sub-timelines appear in the playback screen 300 described previously, a facial image timeline may include any number of sub-timelines.
It is noted that while there are only two meeting participants in this example, there are three sub-timelines. This is because a single speaker may be associated with more than a single sub-timeline. In the present example, the second sub-timeline 316 is associated with the second meeting participant 305 while the second meeting participant 305 is at the whiteboard, and the third sub-timeline 404 is associated with the second meeting participant 305 while the second meeting participant 305 is situated at a location other than the whiteboard.
This situation can happen when a meeting participant occupies more than one location during a meeting. The virtual cinematographer 234 in this case has detected speakers in three locations. It does not necessarily know that only two speakers are present in those locations. This feature assists a user in cases where the user is interested mainly in a speaker when the speaker is in a certain position. For example, a user may only want to play a portion of a recorded meeting when a speaker is situated at the whiteboard.
The exemplary playback screen 400 also includes a first facial image 318 of the first meeting participant 303 and a second facial image 320 of the second meeting participant 305. In addition, a third facial image 406 is included and is associated with the third sub-timeline 404. The third facial image 406 corresponds with a second location of the second meeting participant 305.
The techniques used in presenting the exemplary playback screens 300, 400 will be described in greater detail below, with respect to the other figures.
Exemplary Methodological Implementation: Creation of Facial Image Timeline
At block 502, the panoramic camera apparatus 200 samples one or more video images to create a panoramic image. The panoramic image is input to the face tracker 230 (block 504), which detects and tracks faces in the image. Approximately simultaneously, at block 506, the microphone array 208 samples sound corresponding to the panoramic image and inputs the sound into the sound source localizer 232, which detects locations of speakers based on the sampled sound at block 508.
The virtual cinematographer 234 processes data from the face tracker 230 and the sound source localizer 232 to create the timeline 238 at block 510. At block 512, the speaker clustering module 236 clusters speakers temporally and spatially to consolidate and clarify portions of the timeline 238 as described previously.
The timeline is stored in the timeline database 244 with the following fields: time, speaker number, speaker bounding box in image (x, y, width, height), speaker face angles (azimuth, elevation), etc.
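Continuing the illustrative SQLite sketch above, one clustered timeline entry with these fields might be written as follows; every value shown is hypothetical.

```python
# Hypothetical timeline entry written to the illustrative schema above.
import sqlite3

conn = sqlite3.connect("timeline.db")
conn.execute(
    "INSERT INTO timeline VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    (12.5, 1, 640, 120, 80, 96, 45.0, 5.0))   # time, speaker, bbox, angles
conn.commit()
```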
Using the panoramic image and the face identification coordinates (i.e., face bounding boxes) derived by the face tracker 230, the face extractor 240 extracts facial images of the speakers at block 514. Extracted facial images are stored in the faces database 246 and are associated with a speaker number.
As previously noted, the face extractor 240 may be configured to extract more than one image for each speaker and use what the face extractor 240 determines to be the best image in the timeline 238.
An exemplary methodological implementation of selecting a “best” facial image and creating the faces database 246 is shown and described below.
Exemplary Methodological Implementation: Creating a Faces Database
At block 602, the face extractor 240 extracts a facial image from the panoramic image as described above. If a facial image for the speaker is not already stored in the faces database 246 (“No” branch, block 604), then the facial image is stored in the faces database 246 at block 610. It is noted that determining whether a facial image is already stored does not depend on whether the person who appears in the facial image already has an image of their likeness stored, but on whether the identified speaker has a corresponding image stored. Thus, if a speaker located at a first position has a stored facial image and that speaker is then detected at a second location, a facial image of the speaker at the second location is not compared with the stored facial image from the first position; the speaker at the second location is treated as a new speaker.
If a facial image for the speaker is already stored in the faces database 246—hereinafter, the “stored facial image”—(“Yes” branch, block 604), then the facial image is compared to the stored facial image at block 606. If the face extractor 240 determines that the facial image is better or more acceptable than the stored facial image (“Yes” branch, block 608), then the facial image is stored in the faces database 246, thus overwriting the previously stored facial image.
If the facial image is not better than the stored facial image (“No” branch, block 608), then the facial image is discarded and the stored facial image is retained.
The criteria for determining which facial image is better can be numerous and varied. For instance, the face extractor 240 may be configured to determine that a “best” facial image is one that captures a speaker in a position where the speaker's face is most nearly frontal. Or, if a first facial image shows signs of motion and a second facial image does not, then the face extractor 240 may determine that the second facial image is the best facial image. Or, the face extractor 240 may be configured to determine which of multiple images of a speaker exhibits maximum symmetry and to use that facial image in the timeline. Other criteria not enumerated here may also be used to determine the most appropriate facial image to utilize with the timeline.
If there is another speaker (“Yes” branch, block 612), then the process reverts to block 602 and is repeated for each unique speaker. Again, “unique speaker” as used in this context does not necessarily mean a unique person, since a person that appears in different speaking locations may be interpreted as being different speakers. The process terminates when there are no more unique speakers to identify (“No” branch, block 612).
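The keep-best flow of blocks 602 through 612 can be summarized in a few lines. This sketch assumes a scoring function such as the symmetry score above and keys faces by speaker number (a detected location), not by personal identity.

```python
# Sketch of the keep-best update over unique speakers (blocks 602-612);
# faces_db maps speaker number -> (score, image). Speaker numbers denote
# detected locations, not personal identities.
def update_faces_db(faces_db, speaker, face_img, score):
    stored = faces_db.get(speaker)
    if stored is None or score > stored[0]:
        faces_db[speaker] = (score, face_img)  # store or overwrite
    # otherwise the new image is discarded and the stored one retained
```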
While one or more exemplary implementations have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the claims appended hereto.
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US4626893 *||Apr 18, 1984||Dec 2, 1986||Kabushiki Kaisha Toshiba||White balance adjusting system for plural color television cameras|
|US5504524 *||Oct 13, 1994||Apr 2, 1996||Vlsi Vision Limited||Method and apparatus for controlling color balance of a video signal|
|US5539483||Jun 30, 1995||Jul 23, 1996||At&T Corp.||Panoramic projection apparatus|
|US5745305||Apr 28, 1995||Apr 28, 1998||Lucent Technologies Inc.||Panoramic viewing apparatus|
|US5793527||Jun 30, 1995||Aug 11, 1998||Lucent Technologies Inc.||High resolution viewing system|
|US5990934||Apr 28, 1995||Nov 23, 1999||Lucent Technologies, Inc.||Method and system for panoramic viewing|
|US6005611||Aug 4, 1998||Dec 21, 1999||Be Here Corporation||Wide-angle image dewarping method and apparatus|
|US6043837||May 8, 1997||Mar 28, 2000||Be Here Corporation||Method and apparatus for electronically distributing images from a panoptic camera system|
|US6101287 *||May 27, 1998||Aug 8, 2000||Intel Corporation||Dark frame subtraction|
|US6111702||Oct 7, 1997||Aug 29, 2000||Lucent Technologies Inc.||Panoramic viewing system with offset virtual optical centers|
|US6115176||Nov 30, 1995||Sep 5, 2000||Lucent Technologies Inc.||Spherical viewing/projection apparatus|
|US6128143||Aug 28, 1998||Oct 3, 2000||Lucent Technologies Inc.||Panoramic viewing system with support stand|
|US6141145||Aug 28, 1998||Oct 31, 2000||Lucent Technologies||Stereo panoramic viewing system|
|US6144501||Aug 28, 1998||Nov 7, 2000||Lucent Technologies Inc.||Split mirrored panoramic image display|
|US6175454||Jan 13, 1999||Jan 16, 2001||Behere Corporation||Panoramic imaging arrangement|
|US6195204||Aug 28, 1998||Feb 27, 2001||Lucent Technologies Inc.||Compact high resolution panoramic viewing system|
|US6219089||Jun 24, 1999||Apr 17, 2001||Be Here Corporation||Method and apparatus for electronically distributing images from a panoptic camera system|
|US6219090||Nov 1, 1999||Apr 17, 2001||Lucent Technologies Inc.||Panoramic viewing system with offset virtual optical centers|
|US6222683||Jul 31, 2000||Apr 24, 2001||Be Here Corporation||Panoramic imaging arrangement|
|US6285365||Aug 28, 1998||Sep 4, 2001||Fullview, Inc.||Icon referenced panoramic image display|
|US6313865||Jan 10, 2000||Nov 6, 2001||Be Here Corporation||Method and apparatus for implementing a panoptic camera system|
|US6331869||Aug 7, 1998||Dec 18, 2001||Be Here Corporation||Method and apparatus for electronically distributing motion panoramic images|
|US6337708||Apr 21, 2000||Jan 8, 2002||Be Here Corporation||Method and apparatus for electronically distributing motion panoramic images|
|US6341044||Oct 19, 1998||Jan 22, 2002||Be Here Corporation||Panoramic imaging arrangement|
|US6346967||Oct 28, 1999||Feb 12, 2002||Be Here Corporation||Method apparatus and computer program products for performing perspective corrections to a distorted image|
|US6356296||May 8, 1997||Mar 12, 2002||Behere Corporation||Method and apparatus for implementing a panoptic camera system|
|US6356397||Dec 27, 1999||Mar 12, 2002||Fullview, Inc.||Panoramic viewing system with shades|
|US6369818||Nov 25, 1998||Apr 9, 2002||Be Here Corporation||Method, apparatus and computer program product for generating perspective corrected data from warped information|
|US6373642||Aug 20, 1998||Apr 16, 2002||Be Here Corporation||Panoramic imaging arrangement|
|US6388820||Nov 26, 2001||May 14, 2002||Be Here Corporation||Panoramic imaging arrangement|
|US6392687||Aug 4, 2000||May 21, 2002||Be Here Corporation||Method and apparatus for implementing a panoptic camera system|
|US6424377||Jul 11, 2000||Jul 23, 2002||Be Here Corporation||Panoramic camera|
|US6426774||Jul 13, 2000||Jul 30, 2002||Be Here Corporation||Panoramic camera|
|US6459451||Jun 11, 1997||Oct 1, 2002||Be Here Corporation||Method and apparatus for a panoramic camera to capture a 360 degree image|
|US6466254||Jun 7, 2000||Oct 15, 2002||Be Here Corporation||Method and apparatus for electronically distributing motion panoramic images|
|US6480229||Jul 17, 2000||Nov 12, 2002||Be Here Corporation||Panoramic camera|
|US6493032||Nov 12, 1999||Dec 10, 2002||Be Here Corporation||Imaging arrangement which allows for capturing an image of a view at different resolutions|
|US6515696||Apr 25, 2000||Feb 4, 2003||Be Here Corporation||Method and apparatus for presenting images from a remote location|
|US6535649 *||May 11, 1999||Mar 18, 2003||Umax Data Systems Inc.||Dynamic calibration method|
|US6539547||Apr 3, 2001||Mar 25, 2003||Be Here Corporation||Method and apparatus for electronically distributing images from a panoptic camera system|
|US6583815||Aug 14, 2000||Jun 24, 2003||Be Here Corporation||Method and apparatus for presenting images from a remote location|
|US6593969||Mar 8, 2000||Jul 15, 2003||Be Here Corporation||Preparing a panoramic image for presentation|
|US6597520||Apr 9, 2002||Jul 22, 2003||Be Here Corporation||Panoramic imaging arrangement|
|US6628825 *||Jun 22, 1999||Sep 30, 2003||Canon Kabushiki Kaisha||Image processing method, apparatus and memory medium therefor|
|US6700711||Mar 8, 2002||Mar 2, 2004||Fullview, Inc.||Panoramic viewing system with a composite field of view|
|US6741250||Oct 17, 2001||May 25, 2004||Be Here Corporation||Method and system for generation of multiple viewpoints into a scene viewed by motionless cameras and for presentation of a view path|
|US6756990||Apr 3, 2001||Jun 29, 2004||Be Here Corporation||Image filtering on 3D objects using 2D manifolds|
|US6788340 *||Dec 30, 1999||Sep 7, 2004||Texas Instruments Incorporated||Digital imaging control with selective intensity resolution enhancement|
|US6795106||May 18, 1999||Sep 21, 2004||Intel Corporation||Method and apparatus for controlling a video camera in a video conferencing system|
|US6885509||Jun 27, 2001||Apr 26, 2005||Be Here Corporation||Imaging arrangement which allows for capturing an image of a view at different resolutions|
|US6917702 *||Apr 24, 2002||Jul 12, 2005||Mitsubishi Electric Research Labs, Inc.||Calibration of multiple cameras for a turntable-based 3D scanner|
|US6924832||Sep 11, 2000||Aug 2, 2005||Be Here Corporation||Method, apparatus & computer program product for tracking objects in a warped video image|
|US20020034020||Nov 26, 2001||Mar 21, 2002||Be Here Corporation||Panoramic imaging arrangement|
|US20020041324||Sep 26, 2001||Apr 11, 2002||Kozo Satoda||Video conference system|
|US20020063802||Dec 10, 2001||May 30, 2002||Be Here Corporation||Wide-angle dewarping method and apparatus|
|US20020094132||Jan 24, 2002||Jul 18, 2002||Be Here Corporation||Method, apparatus and computer program product for generating perspective corrected data from warped information|
|US20020154417||Apr 9, 2002||Oct 24, 2002||Be Here Corporation||Panoramic imaging arrangement|
|US20030142402||Jan 30, 2002||Jul 31, 2003||Be Here Corporation||Method and apparatus for triggering a remote flash on a camera with a panoramic lens|
|US20030146982 *||Feb 1, 2002||Aug 7, 2003||Tindall John R.||Special color pigments for calibrating video cameras|
|US20030184660 *||Apr 2, 2002||Oct 2, 2003||Michael Skow||Automatic white balance for digital imaging|
|US20030193606||Apr 17, 2003||Oct 16, 2003||Be Here Corporation||Panoramic camera|
|US20030193607||Apr 17, 2003||Oct 16, 2003||Be Here Corporation||Panoramic camera|
|US20030220971||Aug 16, 2002||Nov 27, 2003||International Business Machines Corporation||Method and apparatus for video conferencing with audio redirection within a 360 degree view|
|US20040008407||Jan 3, 2003||Jan 15, 2004||Be Here Corporation||Method for designing a lens system and resulting apparatus|
|US20040008423||Jun 12, 2003||Jan 15, 2004||Driscoll Edward C.||Visual teleconferencing apparatus|
|US20040021764||Jan 3, 2003||Feb 5, 2004||Be Here Corporation||Visual teleconferencing apparatus|
|US20040252384||Jun 12, 2003||Dec 16, 2004||Wallerstein Edward P.||Panoramic imaging system|
|US20040254982||Jun 12, 2003||Dec 16, 2004||Hoffman Robert G.||Receiving system for video conferencing system|
|US20050046703||Sep 30, 2004||Mar 3, 2005||Cutler Ross G.||Color calibration in photographic devices|
|US20050078172 *||Oct 9, 2003||Apr 14, 2005||Michael Harville||Method and system for coordinating communication devices to create an enhanced representation of an ongoing event|
|US20050117034||Dec 30, 2004||Jun 2, 2005||Microsoft Corp.||Temperature compensation in multi-camera photographic devices|
|US20050151837||Dec 30, 2004||Jul 14, 2005||Microsoft Corp.||Minimizing dead zones in panoramic images|
|US20060198554 *||Nov 28, 2003||Sep 7, 2006||Porter Robert M S||Face detection|
|WO1998047291A2||Mar 26, 1998||Oct 22, 1998||Isight Ltd.||Video teleconferencing|
|WO2000011512A1||Sep 25, 1998||Mar 2, 2000||Be Here Corporation||A panoramic imaging arrangement|
|WO2004004320A1||Jul 1, 2003||Jan 8, 2004||The Regents Of The University Of California||Digital processing of video images|
|WO2004111689A2||Jun 10, 2004||Dec 23, 2004||Be Here Corporation||Panoramic imaging system|
|WO2004112290A2||Jun 10, 2004||Dec 23, 2004||Be Here Corporation||Receiving system for video conferencing system|
|WO2005002201A2||Jun 10, 2004||Jan 6, 2005||Be Here Corporation||Visual teleconferencing apparatus|
|1||Applicant's Statement: The references cited in this IDS were previously submitted to the Office in connection with U.S. Appl. No. 10/177,315 filed Jun. 21, 2002 and again in connection with the present Application, the references being filed on Oct. 22, 2008. Applicant points out that for any reference listed in the accompanying IDS, should a year of publication be listed without a month, that the year of publication is sufficiently earlier than the effective U.S. filed and any foreign priority date so that the particular month of publication is not in issue (see MPEP 609.04(a)).|
|2||Charfi, et al., "Focusing Criterion", Electronics Letters, vol. 27, no. 14, pp. 1233-1235, Jul. 1991.|
|3||Choi, et al., "New Autofocusing Technique Using the Frequency Selective Weighted Median Filter for Video Cameras", IEEE Transactions on Consumer Electronics, vol. 45, no. 3, Aug. 1999.|
|4||Davis, J., "Mosaics of Scenes with Moving Objects", IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1998.|
|5||European Patent Office European Search Report for European Patent Application No. 05 111 765.3, Examiner J. Horstmannshoff, Apr. 4, 2006, Munich.|
|6||Hasler, et al., "Colour Handling in Panoramic Photography", Proceedings of SPIE, vol. 4309, 2001.|
|7||Healey, et al., "Radiometric CCD Camera Calibration and Noise Estimation", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, No. 3, pp. 267-276, Apr. 1994.|
|8||J. Davis "Mosaics of scenes with moving objects", Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Jun. 23-25, 1998, pp. 354-360.|
|9||Kang, et al., "Can We Calibrate a Camera Using an Image of a Flat, Textureless Lambertian Surface", ECCV, 2000.|
|10||Kemberova, et al., "The Effect of Radiometric Correction on Multicamera Algorithms", Technical Report MS-CIS-97-17, 1997.|
|11||Majumder, et al., "Achieving Color Uniformity Across Multi-Projector Displays", IEEE Visualization, 2000.|
|12||Subbarao, et al., "Selecting the Optimal Focus Measure for Autofocusing and Depth-From-Focus", IEEE Transactions on PAMI, vol. 20, no. 8, Aug. 1998.|
|13||Szeliski, et al., "Creating Full View Panoramic Image Mosaics and Environment Maps", Computer Graphics (SIGGRAPH '97), pp. 251-258, 1997.|
|14||Szeliski, et al., "Video Mosaics for Virtual Environments", IEEE Computer Graphics and Applications, pp. 22-30, Mar. 1996.|
|15||Uyttendaele, et al., "Eliminating Ghosting and Exposure Artifacts in Image Mosaics", IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), vol. II, pp. 509-516, Kauai, Hawaii, Dec. 2001.|
|16||Widjaja, "Use of Wavelet Analysis for Improving Autofocusing Capability", Optics Communications 151, pp. 12-14, 1998.|
|17||Zhang, et al., "Real-Time Multi-View Face Detection", Face and Gesture Recognition 2002, Washington, D.C., May 2002.|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US7808521 *||Jan 9, 2006||Oct 5, 2010||Apple Inc.||Multimedia conference recording and manipulation interface|
|US8241125 *||Jul 17, 2007||Aug 14, 2012||Sony Computer Entertainment Europe Limited||Apparatus and method of interaction with a data processor|
|US8687076 *||Sep 22, 2011||Apr 1, 2014||Samsung Electronics Co., Ltd.||Moving image photographing method and moving image photographing apparatus|
|US8868657||Dec 17, 2010||Oct 21, 2014||Avaya Inc.||Method and system for generating a collaboration timeline illustrating application artifacts in context|
|US8949123||Apr 11, 2012||Feb 3, 2015||Samsung Electronics Co., Ltd.||Display apparatus and voice conversion method thereof|
|US9020120||Feb 6, 2013||Apr 28, 2015||Avaya Inc.||Timeline interface for multi-modal collaboration|
|US9124762 *||Dec 20, 2012||Sep 1, 2015||Microsoft Technology Licensing, Llc||Privacy camera|
|US9257117 *||Feb 4, 2014||Feb 9, 2016||Avaya Inc.||Speech analytics with adaptive filtering|
|US20070165105 *||Jan 9, 2006||Jul 19, 2007||Lengeling Gerhard H J||Multimedia conference recording and manipulation interface|
|US20090318228 *||Jul 17, 2007||Dec 24, 2009||Sony Computer Entertainment Europe Limited||Apparatus and method of interaction with a data processor|
|US20140176663 *||Dec 20, 2012||Jun 26, 2014||Microsoft Corporation||Privacy camera|
|US20150097920 *||Dec 16, 2014||Apr 9, 2015||Sony Corporation||Information processing apparatus and information processing method|
|US20150221299 *||Feb 4, 2014||Aug 6, 2015||Avaya, Inc.||Speech analytics with adaptive filtering|
|US20160065895 *||Aug 19, 2015||Mar 3, 2016||Huawei Technologies Co., Ltd.||Method, apparatus, and system for presenting communication information in video communication|
|U.S. Classification||348/14.08, 348/14.06|
|International Classification||G06N99/00, H04N5/76, H04N5/225, G06T1/00, H04N7/15, G03B37/00, H04N9/73, H04N1/387, G06T3/00, H04N17/00, H04N5/265, H04N1/60, G06T7/00, G06T5/40|
|Cooperative Classification||H04N9/73, H04N5/23238, H04N7/15, H04N5/247, H04N1/3876, H04N7/147, H04N17/002, H04N1/6027, G06K9/00295, H04N7/155|
|European Classification||H04N5/232M, H04N7/15, H04N7/14A3, G06K9/00F3U, H04N17/00C, H04N1/60E, H04N1/387D, H04N5/247|
|May 16, 2005||AS||Assignment|
Owner name: MICROSOFT CORPORATION, WASHINGTON
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CUTLER, ROSS G;REEL/FRAME:016018/0218
Effective date: 20050330
|Mar 18, 2013||FPAY||Fee payment|
Year of fee payment: 4
|Dec 9, 2014||AS||Assignment|
Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034543/0001
Effective date: 20141014