A method and system summarizes scenes in a video sequence by detecting scene changes, and then comparing scenes in a moving window to determine their similarity. Similar scenes are consolidated and represented by a representative frame, a number of which are displayed to a user. Scene changes are detected by comparing average color histograms for each scene, motion compensated pixel differences or motion compensated edge maps, or a combination of these methods. Scenes in the video sequence are selected for summarizing according to their normalized time duration. Of the selected scenes, similar or related scenes are determined by comparing the average color histograms of each pair of scenes in a moving window, using a standard population error measure, such as a Chi-squared test. For each set of related scenes, a representative frame is taken, either as the medial frame from the entire time duration of the related scenes or as the first frame of the medial...

Inventors: Katherine Wang, James Normile
Original Assignee: Apple Computer, Inc.
Primary Examiner: Monica S. Davis
Current U.S. Classification: 382/232; 382/236; 707/E17.028
International Classification: G06K 9/36; G06K 9/46; G06K 15/00; G06F 15/00

Citations

Cited Patent | Filing date | Issue date | Original Assignee | Title
US4270143 | Dec 20, 1978 | May 26, 1981 | General Electric Company | Cross-correlation video tracker and method
US4931868 | May 31, 1988 | Jun 5, 1990 | Grumman Aerospace Corporation | Method and apparatus for detecting innovations in a scene
US5049991 | Feb 20, 1990 | Sep 17, 1991 | Victor Company of Japan, Ltd. | Movement compensation predictive coding/decoding method
US5404174 | Jun 28, 1993 | Apr 4, 1995 | Victor Company of Japan, Ltd. | Scene change detector for detecting a scene change of a moving picture
US5459517 | Dec 14, 1993 | Oct 17, 1995 | Fuji Xerox Co., Ltd. | Moving picture scene detection system
US5521841 | Mar 31, 1994 | May 28, 1996 | Siemens Corporate Research, Inc. | Browsing contents of a given video sequence
US5537528 | Feb 16, 1993 | Jul 16, 1996 | International Business Machines Corporation | System and method for inputting scene information

Referenced by

Citing Patent | Filing date | Issue date | Original Assignee | Title
US6014183 | Aug 6, 1997 | Jan 11, 2000 | Imagine Products, Inc. | Method and apparatus for detecting scene changes in a digital video stream
US6084569 | Jul 17, 1997 | Jul 4, 2000 | Avid Technology, Inc. | Editing interface
US6271829 | Jun 30, 2000 | Aug 7, 2001 | Avid Technology, Inc. | Editing interface
US6301302 | Jul 28, 2000 | Oct 9, 2001 | Matsushita Electric Industrial Co., Ltd. | Moving picture search system cross reference to related application
US6363160 | Jan 22, 1999 | Mar 26, 2002 | Intel Corporation | Interface using pattern recognition and tracking
US6396476 | Dec 1, 1998 | May 28, 2002 | Intel Corporation | Synthesizing computer input events
US6429867 | Mar 15, 1999 | Aug 6, 2002 | Sun Microsystems, Inc. | System and method for generating and playback of three-dimensional movies
US6466205 | Nov 19, 1998 | Oct 15, 2002 | Push Entertainment, Inc. | System and method for creating 3D models from 2D sequential image data
US6473095 | Jul 16, 1998 | Oct 29, 2002 | Koninklijke Philips Electronics N.V. | Histogram method for characterizing video content
US6538649 | Dec 1, 1998 | Mar 25, 2003 | Intel Corporation | Computer vision control variable transformation
US6577805 | Nov 17, 1998 | Jun 10, 2003 | Sony Corporation | Picture recording and reproducing apparatus and method
US6591010 | Jul 29, 1999 | Jul 8, 2003 | International Business Machines Corporation | System and method for image detection and qualification
US6611653 | Jan 29, 1999 | Aug 26, 2003 | LG Electronics Inc. | Adaptive display speed automatic control device of motional video and method therefor
US6636220 | May 30, 2000 | Oct 21, 2003 | Microsoft Corporation | Video-based rendering
US6647131 | Aug 27, 1999 | Nov 11, 2003 | Intel Corporation | Motion detection using normal optical flow
US6654483 | Dec 22, 1999 | Nov 25, 2003 | Intel Corporation | Motion detection using normal optical flow
US6721361 | Feb 23, 2001 | Apr 13, 2004 | YesVideo.Com | Video processing system including advanced scene break detection methods for fades, dissolves and flashes
US6772125 | Dec 4, 2001 | Aug 3, 2004 | Sony United Kingdom Limited | Audio/video reproducing apparatus and method
US6807298 | Jan 28, 2000 | Oct 19, 2004 | Electronics and Telecommunications Research Institute | Method for generating a block-based image histogram
US6810145 | Apr 4, 2001 | Oct 26, 2004 | Thomson Licensing, S.A. | Process for detecting a change of shot in a succession of video images
US6882793 | Jun 16, 2000 | Apr 19, 2005 | YesVideo, Inc. | Video processing system
US6931595 | Mar 30, 2001 | Aug 16, 2005 | Sharp Laboratories of America, Inc. | Method for automatic extraction of semantically significant events from video
US6977963 | Feb 14, 2000 | Dec 20, 2005 | Canon Kabushiki Kaisha | Scene change detection method using two-dimensional DP matching, and image processing apparatus for implementing the method
US6996171 | Jan 27, 2000 | Feb 7, 2006 | Sony Corporation | Data describing method and data processor
US7006945 | Mar 31, 2003 | Feb 28, 2006 | Sharp Laboratories of America, Inc. | Processing of video content
US7016540 | Apr 24, 2000 | Mar 21, 2006 | NEC Corporation | Method and system for segmentation, classification, and summarization of video images
US7020351 | Oct 6, 2000 | Mar 28, 2006 | Sarnoff Corporation | Method and apparatus for enhancing and indexing video and audio signals
US7043058 | Apr 20, 2001 | May 9, 2006 | Avid Technology, Inc. | Correcting motion vector maps for image processing
US7047157 | Oct 30, 2004 | May 16, 2006 | Sharp Laboratories of America, Inc. | Processing of video content
US7075683 | Feb 14, 2000 | Jul 11, 2006 | Canon Kabushiki Kaisha | Dynamic image digest automatic editing system and dynamic image digest automatic editing method
US7092040 | Jun 30, 2000 | Aug 15, 2006 | Sharp Kabushiki Kaisha | Dynamic image search information recording apparatus and dynamic image searching device
US7106900 | Jun 30, 2004 | Sep 12, 2006 | Electronics and Telecommunications Research Institute | Method for generating a block-based image histogram
US7120873 | Jan 28, 2002 | Oct 10, 2006 | Sharp Laboratories of America, Inc. | Summarization of sumo video content
US7130446 | Dec 3, 2001 | Oct 31, 2006 | Microsoft Corporation | Automatic detection and tracking of multiple individuals using multiple cues
US7143354 | Aug 20, 2001 | Nov 28, 2006 | Sharp Laboratories of America, Inc. | Summarization of baseball video content
US7149365 | Sep 11, 2002 | Dec 12, 2006 | Pioneer Corporation | Image information summary apparatus, image information summary method and image information summary processing program
US7151843 | Jan 25, 2005 | Dec 19, 2006 | Microsoft Corporation | Automatic detection and tracking of multiple individuals using multiple cues
US7167809 | Oct 30, 2004 | Jan 23, 2007 | Sharp Laboratories of America, Inc. | Processing of video content
US7171025 | Jan 25, 2005 | Jan 30, 2007 | Microsoft Corporation | Automatic detection and tracking of multiple individuals using multiple cues
US7199841 | Dec 24, 2002 | Apr 3, 2007 | LG Electronics Inc. | Apparatus for automatically generating video highlights and method thereof
US7203620 | May 23, 2002 | Apr 10, 2007 | Sharp Laboratories of America, Inc. | Summarization of video content
US7245824 | Mar 7, 2001 | Jul 17, 2007 | | Image indexing systems
US7274864 | Jan 29, 2002 | Sep 25, 2007 | InterVideo Digital Technology Corporation | Method and device for digital video capture
US7296231 | Aug 9, 2001 | Nov 13, 2007 | Eastman Kodak Company | Video structuring by probabilistic merging of video segments
US7310589 | Oct 30, 2004 | Dec 18, 2007 | Sharp Laboratories of America, Inc. | Processing of video content
US7312812 | Jan 5, 2005 | Dec 25, 2007 | Sharp Laboratories of America, Inc. | Summarization of football video content
US7372991 | Sep 26, 2003 | May 13, 2008 | Seiko Epson Corporation | Method and apparatus for summarizing and indexing the contents of an audio-visual presentation
US7424677 | Oct 28, 2004 | Sep 9, 2008 | Sharp Laboratories of America, Inc. | Audiovisual information management system with usage preferences
US7424678 | Oct 28, 2004 | Sep 9, 2008 | Sharp Laboratories of America, Inc. | Audiovisual information management system with advertising
US7428315 | Jan 25, 2005 | Sep 23, 2008 | Microsoft Corporation | Automatic detection and tracking of multiple individuals using multiple cues
US7474331 | Jan 3, 2005 | Jan 6, 2009 | Sharp Laboratories of America, Inc. | Summarization of football video content
US7474698 | Sep 27, 2002 | Jan 6, 2009 | Sharp Laboratories of America, Inc. | Identification of replay segments
US7492935 | Jun 25, 2004 | Feb 17, 2009 | Given Imaging Ltd | Device, method, and system for reduced transmission imaging
US7493014 | Dec 6, 2001 | Feb 17, 2009 | Sony United Kingdom Limited | Replaying video information
US7499077 | Aug 20, 2001 | Mar 3, 2009 | Sharp Laboratories of America, Inc. | Summarization of football video content
US7509580 | Oct 28, 2004 | Mar 24, 2009 | Sharp Laboratories of America, Inc. | Audiovisual information management system with preferences descriptions
US7594245 | Jun 13, 2005 | Sep 22, 2009 | Sharp Laboratories of America, Inc. | Networked video devices
US7599554 | Apr 2, 2004 | Oct 6, 2009 | Koninklijke Philips Electronics N.V. | Method and apparatus for summarizing a music video using content analysis
US7639275 | Jan 3, 2005 | Dec 29, 2009 | Sharp Laboratories of America, Inc. | Summarization of football video content
US7653131 | Dec 2, 2005 | Jan 26, 2010 | Sharp Laboratories of America, Inc. | Identification of replay segments
US7657836 | Sep 27, 2002 | Feb 2, 2010 | Sharp Laboratories of America, Inc. | Summarization of soccer video content
US7657907 | Sep 30, 2002 | Feb 2, 2010 | Sharp Laboratories of America, Inc. | Automatic user profiling
US7668438 | Feb 7, 2005 | Feb 23, 2010 | YesVideo, Inc. | Video processing system
US7697754 | Feb 8, 2006 | Apr 13, 2010 | Electronics and Telecommunications Research Institute | Method for generating a block-based image histogram
US7702014 | Dec 16, 1999 | Apr 20, 2010 | Muvee Technologies Pte. Ltd. | System and method for video production
US7793205 | Jul 8, 2005 | Sep 7, 2010 | Sharp Laboratories of America, Inc. | Synchronization of video and data
US7853865 | Jul 8, 2005 | Dec 14, 2010 | Sharp Laboratories of America, Inc. | Synchronization of video and data
US7880936 | Mar 27, 2006 | Feb 1, 2011 | Canon Kabushiki Kaisha | Dynamic image digest automatic editing system and dynamic image digest automatic editing method
US7884884 | Mar 22, 2006 | Feb 8, 2011 | Sharp Kabushiki Kaisha | Dynamic image search information recording apparatus and dynamic image searching devices
US7890331 | May 17, 2004 | Feb 15, 2011 | Koninklijke Philips Electronics N.V. | System and method for generating audio-visual summaries for audio-visual program content
US7904814 | Dec 13, 2001 | Mar 8, 2011 | Sharp Laboratories of America, Inc. | System for presenting audio-video content
US7995116 | Mar 22, 2010 | Aug 9, 2011 | Eastman Kodak Company | Varying camera self-determination based on subject motion
US7996878 | Aug 29, 2000 | Aug 9, 2011 | AT&T Intellectual Property II, L.P. | System and method for generating coded video sequences from still media
US8006186 | Dec 22, 2000 | Aug 23, 2011 | Muvee Technologies Pte. Ltd. | System and method for media production
US8018491 | Jan 3, 2005 | Sep 13, 2011 | Sharp Laboratories of America, Inc. | Summarization of football video content
US8020183 | Mar 30, 2001 | Sep 13, 2011 | Sharp Laboratories of America, Inc. | Audiovisual management system
US8028234 | Mar 8, 2005 | Sep 27, 2011 | Sharp Laboratories of America, Inc. | Summarization of sumo video content
US8028314 | May 26, 2000 | Sep 27, 2011 | Sharp Laboratories of America, Inc. | Audiovisual information management system
US8089563 | Jan 3, 2006 | Jan 3, 2012 | Fuji Xerox Co., Ltd. | Method and system for analyzing fixed-camera video via the selection, visualization, and interaction with storyboard keyframes
US8176426 | Sep 27, 2005 | May 8, 2012 | Nikon Corporation | Image reproduction apparatus and image reproduction program product
US8214741 | May 22, 2002 | Jul 3, 2012 | Sharp Laboratories of America, Inc. | Synchronization of video and data
US8243203 | Feb 21, 2007 | Aug 14, 2012 | LG Electronics Inc. | Apparatus for automatically generating video highlights and method thereof
USRE41939 | Aug 3, 2006 | Nov 16, 2010 | Sony United Kingdom Limited | Audio/video reproducing apparatus and method

Claims

1. A method of summarizing a temporally ordered plurality of scenes in a video sequence including a plurality of frames, comprising the steps of:

detecting at least one scene in the plurality of frames, each scene including at least one related frame by:
determining for each frame at least one global measurement; and
determining a scene change between a pair of successive frames from a difference between the global measurements of each frame in the pair of frames, a Chi-squared value on a color histogram of each frame in the pair of frames, and a normalized motion compensated pixel difference between the frames;
determining at least one set of related scenes;
determining a representative frame for each set of related scenes; and
displaying each representative frame.

2. The method of claim 1, wherein the determination of the scene change is performed using a neural network receiving the difference between the global measurements, the Chi-squared value, and the normalized motion compensated pixel difference.
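
As an illustration only, the combination of the three measurements in claim 2 could be sketched as a small feed-forward network. The claim does not specify the network's size, weights, or activation function; the single hidden layer, the sigmoid activation, and every name below are assumptions made for this sketch, and real weights would have to be learned from labeled video.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def scene_change_probability(features, hidden_weights, hidden_biases,
                             output_weights, output_bias):
    """Forward pass of a hypothetical one-hidden-layer network.

    features: (global measurement difference, Chi-squared value,
               normalized motion compensated pixel difference)
    hidden_weights: one weight list per hidden unit (assumed, not from the patent)
    """
    hidden = [
        sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Placeholder weights chosen by hand purely to exercise the code.
weights = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
p_low = scene_change_probability((0.1, 0.5, 0.2), weights, [0.0, 0.0], [1.0, 1.0], 0.0)
p_high = scene_change_probability((0.9, 30.0, 0.8), weights, [0.0, 0.0], [1.0, 1.0], 0.0)
```

A scene change would then be declared when the output exceeds some decision level, which is likewise an assumption here.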

3. The method of claim 1, wherein the step of detecting at least one scene further comprises the steps of:

for each frame of the video sequence, beginning with a first frame, determining a motion compensated pixel luminance difference between a current frame and a next frame; and
responsive to the motion compensated pixel luminance difference being greater than a threshold value, associating the first frame with all frames up to and including the current frame to form a scene.
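
One possible reading of the motion compensated pixel luminance difference in claim 3 is sketched below. It assumes frames are 2-D lists of luminance values and uses exhaustive block matching with a sum of absolute differences; the block size, search range, and function names are assumptions for this sketch, not details taken from the claims.

```python
def block_sad(cur, nxt, top, left, dy, dx, size):
    """Sum of absolute differences between a block of `cur` and the
    block of `nxt` displaced by (dy, dx)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[top + y][left + x] - nxt[top + dy + y][left + dx + x])
    return total


def motion_compensated_difference(cur, nxt, size=4, search=2):
    """Mean per-pixel luminance difference after each block of `cur`
    is matched to its best displacement in `nxt`."""
    height, width = len(cur), len(cur[0])
    total = 0
    pixels = 0
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip displacements that leave the next frame.
                    if not (0 <= top + dy and top + dy + size <= height):
                        continue
                    if not (0 <= left + dx and left + dx + size <= width):
                        continue
                    sad = block_sad(cur, nxt, top, left, dy, dx, size)
                    if best is None or sad < best:
                        best = sad
            total += best
            pixels += size * size
    return total / pixels if pixels else 0.0


def blank(value=0, height=8, width=8):
    return [[value] * width for _ in range(height)]


# A bright 2x2 square that moves one pixel to the right between frames.
frame_a = blank()
frame_b = blank()
for y in (2, 3):
    for x in (1, 2):
        frame_a[y][x] = 200
        frame_b[y][x + 1] = 200

moved = motion_compensated_difference(frame_a, frame_b)     # motion is compensated
cut = motion_compensated_difference(blank(0), blank(255))   # unrelated content
```

In this sketch the moving square produces no residual difference, while a change of content cannot be explained by any displacement, so only the second pair would exceed a threshold and close the current scene.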

4. The method of claim 3, further comprising the step of:

equalizing a color histogram of each frame of the video sequence prior to determining the motion compensated pixel luminance difference.
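
The equalization step of claim 4 can be illustrated with the textbook cumulative-distribution mapping. This sketch assumes a single channel stored as a flat list of integer intensities; applying it per color channel, and the function name itself, are assumptions rather than details from the claim.

```python
def equalize(values, levels=256):
    """Histogram-equalize a flat list of integer intensities in [0, levels)."""
    if not values:
        return []
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(values)
    if total == cdf_min:
        return list(values)  # constant image: nothing to spread out
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[v] - cdf_min) * scale) for v in values]


equalized = equalize([50, 50, 100, 150])
```

Equalizing first stretches each frame to the full intensity range, which should make the later luminance comparison less sensitive to overall brightness changes such as exposure shifts.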

5. The method of claim 1, wherein the step of detecting at least one scene further comprises the steps of:

for each frame of the video sequence, beginning with a first frame, determining a motion compensated edge map difference between a current frame and a next frame; and
responsive to the motion compensated edge map difference being greater than a threshold value, associating the first frame with all frames up to and including the current frame to form a scene.
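
A rough sketch of the edge map comparison in claim 5 follows. The claim does not say how edges are extracted or how motion is compensated; here edges are thresholded luminance steps and compensation is a search over small global shifts. Both choices, and all names and default values, are assumptions for illustration.

```python
def edge_map(frame, threshold=32):
    """Binary edge map: 1 where the luminance step to the right or
    below exceeds `threshold`."""
    height, width = len(frame), len(frame[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height - 1):
        for x in range(width - 1):
            step_right = abs(frame[y][x + 1] - frame[y][x])
            step_down = abs(frame[y + 1][x] - frame[y][x])
            if max(step_right, step_down) > threshold:
                edges[y][x] = 1
    return edges


def edge_map_difference(cur, nxt, search=2):
    """Smallest fraction of mismatching edge pixels over global shifts
    of up to `search` pixels in each direction."""
    a, b = edge_map(cur), edge_map(nxt)
    height, width = len(a), len(a[0])
    best = 1.0
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            mismatches = 0
            compared = 0
            for y in range(max(0, -dy), min(height, height - dy)):
                for x in range(max(0, -dx), min(width, width - dx)):
                    mismatches += a[y][x] != b[y + dy][x + dx]
                    compared += 1
            if compared:
                best = min(best, mismatches / compared)
    return best


def blank(height=8, width=8):
    return [[0] * width for _ in range(height)]


# A bright square that moves one pixel to the right, and an empty frame.
square = blank()
shifted = blank()
for y in (2, 3):
    for x in (1, 2):
        square[y][x] = 200
        shifted[y][x + 1] = 200

same_scene = edge_map_difference(square, shifted)
new_scene = edge_map_difference(square, blank())
```

Comparing edges instead of raw luminance is a common way to reduce sensitivity to lighting changes; whether that is the motivation in the patent is not stated in the claims.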

6. The method of claim 1, wherein the step of detecting at least one scene further comprises the steps of:

for each frame of the video sequence, beginning with a first frame, determining a standard error of differences between a color histogram for a current frame and a color histogram of a next frame; and
responsive to the standard error being greater than a threshold value, associating the first frame with all frames up to and including the current frame to form a scene.

7. The method of claim 9, wherein the standard error is measured using a Chi-squared test.
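
The histogram comparison in claims 6 and 7 can be sketched as follows, assuming each frame is a flat list of (r, g, b) tuples with 8-bit channels. The bin count, the symmetric form of the Chi-squared statistic, and the function names are assumptions for this sketch; the claims only require a Chi-squared test on the color histograms of successive frames.

```python
def color_histogram(frame, bins=4, max_value=256):
    """Count pixels per quantized (r, g, b) bin."""
    step = max_value // bins
    hist = [0] * (bins ** 3)
    for r, g, b in frame:
        index = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[index] += 1
    return hist


def chi_squared(hist_a, hist_b):
    """Symmetric Chi-squared distance between two histograms."""
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return total


def split_into_scenes(frames, threshold):
    """Return scenes as (first_index, last_index) pairs, closing a scene
    whenever the distance to the next frame exceeds `threshold`."""
    if not frames:
        return []
    hists = [color_histogram(frame) for frame in frames]
    scenes = []
    start = 0
    for i in range(len(frames) - 1):
        if chi_squared(hists[i], hists[i + 1]) > threshold:
            scenes.append((start, i))
            start = i + 1
    scenes.append((start, len(frames) - 1))
    return scenes


red = [(255, 0, 0)] * 16
blue = [(0, 0, 255)] * 16
scenes = split_into_scenes([red, red, red, blue, blue], threshold=10.0)
```

Here the first three frames form one scene and the last two another, since only the red-to-blue transition exceeds the threshold.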

8. The method of claim 1, wherein the step of determining at least one set of related scenes further comprises the steps of:

determining for each scene a summary measure derived from all frames in the scene;
comparing the summary measure of each scene with the summary measures of selected other scenes using a standard error measure; and
identifying at least one set of scenes having approximately equal summary measures.

9. The method of claim 8 wherein the summary measure is an average color histogram for all frames in the scene.
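
The grouping in claims 8 and 9 can be illustrated with a short sketch that compares each scene's average color histogram with those of the next few scenes. The window length, the threshold, the greedy first-match labeling, and all names are assumptions made for this example; the Chi-squared distance is the measure named in the abstract.

```python
def average_histogram(histograms):
    """Bin-wise mean of the per-frame histograms of one scene."""
    count = len(histograms)
    return [sum(bins) / count for bins in zip(*histograms)]


def chi_squared(hist_a, hist_b):
    """Symmetric Chi-squared distance between two histograms."""
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return total


def group_related_scenes(scene_histograms, window, threshold):
    """Give each scene a group label. A scene joins the group of the
    earliest scene within `window` positions before it whose average
    histogram is closer than `threshold`."""
    labels = list(range(len(scene_histograms)))
    for i in range(len(scene_histograms)):
        last = min(i + 1 + window, len(scene_histograms))
        for j in range(i + 1, last):
            unassigned = labels[j] == j
            if unassigned and chi_squared(scene_histograms[i], scene_histograms[j]) < threshold:
                labels[j] = labels[i]
    return labels


# Average histograms for an A-B-A pattern, such as alternating camera angles.
scene_a = average_histogram([[10, 0], [10, 0]])
scene_b = average_histogram([[0, 10]])
scene_a_again = average_histogram([[9, 1], [9, 1]])
labels = group_related_scenes([scene_a, scene_b, scene_a_again], window=2, threshold=2.0)
narrow = group_related_scenes([scene_a, scene_b, scene_a_again], window=1, threshold=2.0)
```

With a window of two scenes the first and third scenes share a label; with a window of one they are never compared and stay separate, which shows how the window length bounds which scenes can be consolidated.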

10. The method of claim 8, further comprising the step of:

linking scenes in each group of scenes into a consolidated scene for display as a single continuous scene.

11. The method of claim 8, wherein the representative frame for each set of related scenes is a frame temporally medial to a first frame and a last frame in the set of related scenes.

12. The method of claim 8, wherein the representative frame for each set of related scenes is selected from a longest scene in the set of related scenes.
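
The two ways of choosing a representative frame in claims 11 and 12 can be sketched as below, with each scene given as a (first_index, last_index) pair of frame numbers. Claim 12 does not say which frame of the longest scene is used; taking its middle frame is an assumption of this sketch, as are the function names.

```python
def medial_frame_of_set(scenes):
    """Frame index midway between the first and last frame of the
    whole set of related scenes (claim 11)."""
    first = min(start for start, _ in scenes)
    last = max(end for _, end in scenes)
    return (first + last) // 2


def medial_frame_of_longest_scene(scenes):
    """Middle frame of the longest scene in the set (one reading of
    claim 12; the choice of the middle frame is an assumption)."""
    start, end = max(scenes, key=lambda scene: scene[1] - scene[0])
    return (start + end) // 2


related = [(0, 9), (20, 49)]
by_set = medial_frame_of_set(related)
by_longest = medial_frame_of_longest_scene(related)
```

When related scenes are separated by unrelated material, the midpoint of the whole set can land outside any member scene (for example, a set of (0, 9) and (40, 49) has midpoint 24), which is one reason selecting from the longest scene may be preferable.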

13. A method of removing redundant scenes in a plurality of frames, each scene including at least one frame, the method comprising the steps of:

detecting each scene in the plurality of frames, each scene including at least one related frame by:
determining for each frame at least one global measurement; and
determining a scene change between a pair of successive frames from a difference between the global measurements of each frame in the pair of frames, a Chi-squared value on a color histogram of each frame in the pair of frames, and a normalized motion compensated pixel difference between the frames;
determining for each scene, an average color histogram for all frames included in the scene; and
comparing the average color histogram of each scene with the average color histogram of each of a selected number of subsequent scenes to identify scenes within the selected number of scenes having substantially similar average color histograms, each set of scenes having substantially similar average color histograms forming a set of related scenes.

14. The method of claim 13 wherein the step of comparing is performed by determining a Chi-squared value for the average color histograms of each pair of scenes.

15. A method of removing redundant scenes in a plurality of frames, each scene including at least one frame, the method comprising the steps of:

detecting each scene in the plurality of frames, each scene including at least one related frame by:
determining for each frame at least one global measurement; and
determining a scene change between a pair of successive frames from a difference between the global measurements of each frame in the pair of frames, a Chi-squared value on a color histogram of each frame in the pair of frames, and a normalized motion compensated pixel difference between the frames;
determining for each scene a summary measure derived from all frames in the scene; and
comparing the summary measure of each scene with the summary measure of selected subsequent scenes using a standard error measure to identify at least one group of scenes having substantially similar summary measures.

16. The method of claim 15 wherein the summary measure is an average color histogram.

17. The method of claim 15, wherein the selected other scenes consist of a predetermined number of subsequent scenes.

18. The method of claim 15, further comprising the step of:

linking scenes in each group of scenes into a consolidated scene.

19. The method of claim 15, wherein the representative frame for each set of related scenes is a frame temporally medial to a first frame and a last frame in the set of related scenes.

20. The method of claim 1, further comprising the steps of:

displaying a plurality of slices of frames as a rectangular prism; and
highlighting, in the plurality of slices, the slices of frames that are either a first frame or a last frame of each scene.

21. The method of claim 20, further comprising the step of:

retrieving in response to user selection of any frame, the scene including the selected frame.

22. The method of claim 20, further comprising the steps of:

determining at least one scene having a time duration in excess of a threshold;
displaying for the scene, and within the rectangular prism at the location of the scene, a compressed representation of all frames in the scene; and
responsive to a user selection of the compressed representation, displaying a plurality of slices for all frames in the scene.

23. A system for summarizing a temporally ordered plurality of scenes in a video sequence including a plurality of frames, comprising:

an input device for receiving an input video sequence;
a scene change detector coupled to the input device, the scene change detector detecting at least one scene in the plurality of frames, each scene including at least one related frame, the scene change detector detecting a scene change between a pair of successive frames from a difference between a global measurement of each frame in the pair of frames, a Chi-squared value on a color histogram of each frame in the pair of frames, and a normalized motion compensated pixel difference between the frames;
a related scene detector, coupled to the scene change detector to receive a set of scenes, and determining at least one set of related scenes, the related scene detector determining a representative frame for each set of related scenes;
a display for displaying the representative frame, and selected scenes; and
a user interface display driver coupled to the related scene detector to receive the representative frames, and displaying each representative frame on the display.

24. The system of claim 23, further comprising:

a movie bar generator coupled to receive the scenes from the scene change detector, and the related scenes from the related scene detector, the movie bar generator creating a slice of each frame in each scene, and displaying the slices as a rectangular prism.