A method and system summarize scenes in a video sequence by detecting scene changes and then comparing scenes in a moving window to determine their similarity. Similar scenes are consolidated and represented by a representative frame, a number of which are displayed to a user. Scene changes are detected by comparing average color histograms for each scene, motion compensated pixel differences or motion compensated edge maps, or a combination of these methods. Scenes in the video sequence are selected for summarizing according to their normalized time duration. Of the selected scenes, similar or related scenes are determined by comparing the average color histograms of each pair of scenes in a moving window, using a standard population error measure, such as a Chi-squared test. For each set of related scenes, a representative frame is taken, either as the medial frame from the entire time duration of the related scenes or as the first frame of the medial...
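The first representative-frame option described above (the medial frame over the whole time span of a set of related scenes) can be illustrated with a minimal sketch. The function name and the representation of a scene as an inclusive `(first_frame, last_frame)` index pair are assumptions made for illustration and do not come from the text.

```python
def medial_frame_index(scenes):
    """Return the frame index midway between the earliest first frame and
    the latest last frame of one set of related scenes.

    `scenes` is a hypothetical representation: a list of inclusive
    (first_frame, last_frame) index pairs.
    """
    first = min(start for start, _ in scenes)
    last = max(end for _, end in scenes)
    return (first + last) // 2


# Two related scenes covering frames 10-19 and 30-49.
print(medial_frame_index([(10, 19), (30, 49)]))  # 29
```

In this example the midpoint falls between the two scenes rather than inside either one; how such a case is handled is not stated in the text above, so the sketch simply returns the midpoint.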
Claims
1. A method of summarizing a temporally ordered plurality of scenes in a video sequence including a plurality of frames, comprising the steps of:
- detecting at least one scene in the plurality of frames, each scene including at least one related frame by:
- determining for each frame at least one global measurement; and
- determining a scene change between a pair of successive frames from a difference between the global measurements of each frame in the pair of frames, a Chi-squared value on a color histogram of each frame in the pair of frames, and a normalized motion compensated pixel difference between the frames;
- determining at least one set of related scenes;
- determining a representative frame for each set of related scenes; and
- displaying each representative frame.
2. The method of claim 1, wherein the determination of the scene change is performed using a neural network receiving the difference between the global measurements, the Chi-squared value, and the normalized motion compensated pixel difference.
3. The method of claim 1, wherein the step of detecting at least one scene further comprises the steps of:
- for each frame of the video sequence, beginning with a first frame, determining a motion compensated pixel luminance difference between a current frame and a next frame; and
- responsive to the motion compensated pixel luminance difference being greater than a threshold value, associating the first frame with all frames up to and including the current frame to form a scene.
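The thresholding loop of claim 3 can be sketched as follows. This is a simplified illustration, not the claimed implementation: motion compensation is approximated by a single global integer shift per frame pair (a real system would more likely use per-block motion vectors), frames are assumed to be 2-D lists of luminance values, and the names `mc_luma_difference`, `split_scenes`, and `search` are hypothetical.

```python
def mc_luma_difference(prev, curr, search=1):
    """Smallest mean absolute luminance difference between two frames over
    all integer shifts (dx, dy) within +/- `search` pixels.

    Assumption: one global shift stands in for motion compensation.
    """
    h, w = len(prev), len(prev[0])
    best = float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            total, count = 0, 0
            # Only visit pixels whose shifted position stays inside the frame.
            for y in range(max(0, -dy), min(h, h - dy)):
                for x in range(max(0, -dx), min(w, w - dx)):
                    total += abs(curr[y + dy][x + dx] - prev[y][x])
                    count += 1
            if count:
                best = min(best, total / count)
    return best


def split_scenes(frames, threshold):
    """Group frames into scenes as inclusive (first, last) index pairs.

    A scene is closed at the current frame whenever the difference to the
    next frame exceeds `threshold`.
    """
    scenes, start = [], 0
    for i in range(len(frames) - 1):
        if mc_luma_difference(frames[i], frames[i + 1]) > threshold:
            scenes.append((start, i))
            start = i + 1
    if frames:
        scenes.append((start, len(frames) - 1))
    return scenes
```

With this sketch, a bright pixel that moves by one position between two frames yields a difference of zero, so the frames stay in the same scene, while a cut to a uniformly bright frame exceeds a moderate threshold and starts a new scene.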
4. The method of claim 3, further comprising the step of:
- equalizing a color histogram of each frame of the video sequence prior to determining the motion compensated pixel luminance difference.
5. The method of claim 1, wherein the step of detecting at least one scene further comprises the steps of:
- for each frame of the video sequence, beginning with a first frame, determining a motion compensated edge map difference between a current frame and a next frame; and
- responsive to the motion compensated edge map difference being greater than a threshold value, associating the first frame with all frames up to and including the current frame to form a scene.
6. The method of claim 1, wherein the step of detecting at least one scene further comprises the steps of:
- for each frame of the video sequence, beginning with a first frame, determining a standard error of differences between a color histogram for a current frame and a color histogram of a next frame; and
- responsive to the standard error being greater than a threshold value, associating the first frame with all frames up to and including the current frame to form a scene.
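One way to compute a histogram-based error between two frames, as in claim 6, is sketched below. The claims do not fix a binning scheme or a particular Chi-squared formula, so both are assumptions here: the sketch uses a coarse RGB histogram with 4 bins per channel, normalized to sum to 1, and the symmetric form sum((a - b)^2 / (a + b)). The function names are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram of a non-empty list of (r, g, b) tuples
    with 8-bit channel values. The bin count is an illustrative choice."""
    step = 256 // bins_per_channel
    hist = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1
    n = len(pixels)
    return [count / n for count in hist]


def chi_squared(h1, h2):
    """Symmetric Chi-squared distance between two histograms.
    Bins that are empty in both histograms are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


red_frame = [(255, 0, 0)] * 8
blue_frame = [(0, 0, 255)] * 8
print(chi_squared(color_histogram(red_frame), color_histogram(red_frame)))   # 0
print(chi_squared(color_histogram(red_frame), color_histogram(blue_frame)))  # 2.0
```

With normalized histograms this distance ranges from 0 (identical) to 2 (no overlap), which makes it straightforward to pick a threshold for declaring a scene change.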
7. The method of claim 9, wherein the standard error is measured using a Chi-squared test.
8. The method of claim 1, wherein the step of determining at least one set of related scenes further comprises the steps of:
- determining for each scene a summary measure derived from all frames in the scene;
- comparing the summary measure of each scene with the summary measures of selected other scenes using a standard error measure; and
- identifying at least one set of scenes having approximately equal summary measures.
9. The method of claim 8 wherein the summary measure is an average color histogram for all frames in the scene.
10. The method of claim 8, further comprising the step of:
- linking scenes in each group of scenes into a consolidated scene for display as a single continuous scene.
11. The method of claim 8, wherein the representative frame for each set of related scenes is a frame temporally medial to a first frame and a last frame in the set of related scenes.
12. The method of claim 8, wherein the representative frame for each set of related scenes is selected from a longest scene in the set of related scenes.
13. A method of removing redundant scenes in a plurality of frames, each scene including at least one frame, the method comprising the steps of:
- detecting each scene in the plurality of frames, each scene including at least one related frame by:
- determining for each frame at least one global measurement; and
- determining a scene change between a pair of successive frames from a difference between the global measurements of each frame in the pair of frames, a Chi-squared value on a color histogram of each frame in the pair of frames, and a normalized motion compensated pixel difference between the frames;
- determining for each scene, an average color histogram for all frames included in the scene; and
- comparing the average color histogram of each scene with the average color histogram of each of a selected number of subsequent scenes to identify scenes within the selected number of scenes having substantially similar average color histograms, each set of scenes having substantially similar average color histograms forming a set of related scenes.
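The moving-window comparison of claim 13 can be sketched as follows. This is an illustration under stated assumptions: per-frame histograms are already available as equal-length lists, the Chi-squared form is the symmetric variant sum((a - b)^2 / (a + b)), and the names `average_histogram`, `related_scene_sets`, `window`, and `threshold` are hypothetical. Grouping by transitive merging of labels is also a design choice of the sketch, not something the claim specifies.

```python
def average_histogram(frame_hists):
    """Bin-wise mean of the histograms of all frames in one scene."""
    n = len(frame_hists)
    return [sum(column) / n for column in zip(*frame_hists)]


def chi_squared(h1, h2):
    """Symmetric Chi-squared distance; bins empty in both are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def related_scene_sets(scene_hists, window, threshold):
    """Compare each scene with the next `window` scenes and group scenes
    whose average histograms differ by less than `threshold`.

    Returns a list of groups, each a list of scene indices.
    """
    label = list(range(len(scene_hists)))
    for i, hist_i in enumerate(scene_hists):
        for j in range(i + 1, min(i + 1 + window, len(scene_hists))):
            if chi_squared(hist_i, scene_hists[j]) < threshold:
                old, new = label[j], label[i]
                # Merge the two groups so that similarity is transitive.
                label = [new if l == old else l for l in label]
    groups = {}
    for index, l in enumerate(label):
        groups.setdefault(l, []).append(index)
    return list(groups.values())


# Scenes 0 and 2 look alike (e.g. alternating camera angles); scene 1 differs.
scenes = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(related_scene_sets(scenes, window=2, threshold=0.2))  # [[0, 2], [1]]
```

The window size controls how far apart two similar scenes may be and still be consolidated: with `window=1` in the example above, scenes 0 and 2 are never compared and remain separate.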
14. The method of claim 13 wherein the step of comparing is performed by determining a Chi-squared value for the average color histograms of each pair of scenes.
15. A method of removing redundant scenes in a plurality of frames, each scene including at least one frame, the method comprising the steps of:
- detecting each scene in the plurality of frames, each scene including at least one related frame by:
- determining for each frame at least one global measurement; and
- determining a scene change between a pair of successive frames from a difference between the global measurements of each frame in the pair of frames, a Chi-squared value on a color histogram of each frame in the pair of frames, and a normalized motion compensated pixel difference between the frames;
- determining for each scene a summary measure derived from all frames in the scene; and
- comparing the summary measure of each scene with the summary measure of selected subsequent scenes using a standard error measure to identify at least one group of scenes having substantially similar summary measures.
16. The method of claim 15 wherein the summary measure is an average color histogram.
17. The method of claim 15, wherein the selected other scenes consist of a predetermined number of subsequent scenes.
18. The method of claim 15, further comprising the step of:
- linking scenes in each group of scenes into a consolidated scene.
19. The method of claim 15, wherein the representative frame for each set of related scenes is a frame temporally medial to a first frame and a last frame in the set of related scenes.
20. The method of claim 1, further comprising the steps of:
- displaying a plurality of slices of frames as a rectangular prism; and
- highlighting in the plurality of slices, the slices of frames either a first frame or a last frame of each scene.
21. The method of claim 20, further comprising the step of:
- retrieving in response to user selection of any frame, the scene including the selected frame.
22. The method of claim 20, further comprising the steps of:
- determining at least one scene having a time duration in excess of a threshold;
- displaying for the scene, and within the rectangular prism at the location of the scene, a compressed representation of all frames in the scene; and
- responsive to a user selection of the compressed representation, displaying a plurality of slices for all frames in the scene.
23. A system for summarizing a temporally ordered plurality of scenes in a video sequence including a plurality of frames, comprising:
- an input device for receiving an input video sequence;
- a scene change detector coupled to the input device, the scene change detector detecting at least one scene in the plurality of frames, each scene including at least one related frame, the scene change detector detecting a scene change between a pair of successive frames from a difference between a global measurement of each frame in the pair of frames, a Chi-squared value on a color histogram of each frame in the pair of frames, and a normalized motion compensated pixel difference between the frames;
- a related scene detector, coupled to the scene change detector to receive a set of scenes, and determining at least one set of related scenes, the related scene detector determining a representative frame for each set of related scenes;
- a display for displaying the representative frame, and selected scenes; and
- a user interface display driver coupled to the related scene detector to receive the representative frames, and displaying each representative frame on the display.
24. The system of claim 23, further comprising:
- a movie bar generator coupled to receive the scenes from the scene change detector, and the related scenes from the related scene detector, the movie bar generator creating a slice of each frame in each scene, and displaying the slices as a rectangular prism.
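A rough sketch of how the slices behind such a movie bar might be assembled is given below. The claims do not say which part of a frame forms its slice, so taking the center pixel column is an assumption, as are the function names and the representation of frames as 2-D lists of pixel values. The sketch builds only a flat image with one column per frame; rendering it as a rectangular prism is left out.

```python
def frame_slice(frame, column=None):
    """One vertical slice of a frame: by assumption its center column."""
    col = len(frame[0]) // 2 if column is None else column
    return [row[col] for row in frame]


def movie_bar(frames):
    """Image with one column per frame, in temporal order."""
    slices = [frame_slice(f) for f in frames]
    height = len(slices[0])
    return [[s[y] for s in slices] for y in range(height)]


def highlighted_columns(scenes):
    """Columns to highlight: here the first frame of each scene, where
    scenes are inclusive (first_frame, last_frame) index pairs."""
    return sorted(first for first, _ in scenes)


frames = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
print(movie_bar(frames))                      # [[2, 8], [5, 11]]
print(highlighted_columns([(0, 0), (1, 1)]))  # [0, 1]
```

Because each frame contributes a single column, a user selection at column `i` of the bar maps directly back to frame `i`, which is what makes retrieval of the enclosing scene simple.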