|Publication number||US6859557 B1|
|Application number||US 09/611,649|
|Publication date||Feb 22, 2005|
|Filing date||Jul 7, 2000|
|Priority date||Jul 7, 2000|
|Inventors||Matthew T. Uyttendaele, Richard S. Szeliski|
|Original Assignee||Microsoft Corp.|
1. Technical Field
This invention is directed towards a system and process for selectively decoding and decompressing portions of the frames of a panoramic video.
2. Background Art
Panoramic video is constructed of a sequence of frames, each of which depicts a 360 degree view of the surrounding scene or some significant portion thereof. These frames are played in a panoramic video viewer in sequence to produce a video of the scene. A panoramic video viewer may allow a user to navigate through the scene by processing user commands to pan right, left, up or down. In other words, a person viewing the panoramic video can electronically steer his or her viewpoint around in the scene as the video is playing. Such a panoramic viewer is the subject of a co-pending application entitled “A System and Process for Viewing Panoramic Video”, which has the same inventors as this application and which is assigned to a common assignee. The co-pending application was filed on Jul. 7, 2000 and assigned Ser. No. 09/611,987, and is now U.S. Pat. No. 6,559,846. The disclosure of this patent is hereby incorporated by reference.
As can be envisioned from the above discussion of viewing a frame of a panoramic video, only a portion of the overall image is displayed to the user. Thus, much of the overall frame is not viewed at any one time. However, typically the entire panoramic video frame is input and processed by the viewer.
In general, the transmission and storage of panoramic video frames present difficulties due to the amount of information they contain. In the case where these frames are transferred to the viewer over a network, such as the Internet, they will typically be compressed in some way. Unfortunately, even in compressed form these frames represent a considerable amount of data, and so present problems in transmitting them to the viewer, as well as in processing and storing them once received. Such large files are slow to transfer to a viewer, and processing the image data in real time requires large amounts of Random Access Memory (RAM) as well as powerful processors. Even in a case where the frames are input to the viewer directly from a storage medium, such as a hard drive, CD, DVD, or the like, their size, especially if uncompressed, imposes considerable storage and processing requirements on the viewer.
The present invention overcomes the aforementioned limitations with a system and process that segments the panoramic video frames, thereby allowing selective decoding of just those specific regions that are to be viewed. Specifically, each frame is segmented into a plurality of regions. The frames are segmented in the same way such that the segmented regions correspond from one frame to the next. Each segmented region is then optionally compressed and encoded separately. Thus, separate video streams are generated for each of the segmented regions of each panoramic video frame.
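The segmentation step described above can be sketched as follows; the grid geometry, tile sizes, and use of NumPy arrays are illustrative assumptions, not details fixed by the patent.

```python
import numpy as np

def segment_frame(frame, rows, cols):
    """Split one panoramic frame (an H x W x 3 array) into a grid of
    equal-sized regions. The same grid is reused for every frame, so
    region (r, c) of one frame corresponds to region (r, c) of the next."""
    h, w = frame.shape[0] // rows, frame.shape[1] // cols
    return {
        (r, c): frame[r * h:(r + 1) * h, c * w:(c + 1) * w]
        for r in range(rows)
        for c in range(cols)
    }

# A 64 x 360 "panorama" split into a single row of six 64 x 60 tiles;
# each tile sequence would then be compressed and encoded as its own stream.
frame = np.zeros((64, 360, 3), dtype=np.uint8)
tiles = segment_frame(frame, rows=1, cols=6)
```

Because the grid is identical for every frame, the series of tiles at one grid position forms an independent video stream suitable for separate compression and encoding.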
Once the panoramic video frames have been segmented, compressed (if desired), and encoded, they are ready for transfer to the viewer. This can be accomplished in a number of ways, each with particular advantages. One way to transfer the frames involves an interactive approach. Essentially, the viewer, such as the one described in the aforementioned co-pending application, identifies the portions of the scene the user is currently viewing. In the case of a network connection, the viewer then informs a server of the segments of the next frame of the video that are needed to render the desired view of the scene to the user. The server then transfers only the requested segments of the next panoramic video frame to the viewer. This process is repeated for each frame of the panoramic video.
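The interactive exchange can be illustrated with a small helper that maps the user's current view onto the segment regions the viewer would request for the next frame; the pan/field-of-view geometry and function name are hypothetical, since the patent does not prescribe a particular mapping.

```python
def visible_regions(pan_deg, fov_deg, cols):
    """Return the indices of the grid columns overlapping the user's
    current horizontal field of view [pan_deg, pan_deg + fov_deg);
    these are the only segment streams the viewer needs to request."""
    deg_per_col = 360 / cols
    first = int(pan_deg // deg_per_col)
    last = int((pan_deg + fov_deg - 1e-9) // deg_per_col)
    return sorted({i % cols for i in range(first, last + 1)})

# A 60-degree view into a panorama split into six regions: the viewer
# asks the server for only two of the six segment streams.
needed = visible_regions(pan_deg=30, fov_deg=60, cols=6)  # → [0, 1]
```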
The foregoing interactive embodiment has the advantages of preserving the bandwidth utilized when sending data from the server to the viewer since only the data actually used by the viewer is transmitted. In addition, the processing and storage requirements of the viewer are minimized, as only those portions of each frame that are needed have to be decoded, decompressed and stored.
Of course, in some circumstances an interactive approach will not be desired or possible. In such cases, the system and process of the present invention still has advantages. Granted, all the segmented regions of each panoramic frame must be sent to the viewer as there is no feedback as to which regions are needed. However, once received, the viewer can selectively process and decompress (if necessary) only those segments required to display the portion of the scene currently being viewed by the user. Thus, the processing and storage requirements of the viewer are minimized.
This panoramic video segmentation technique according to the present invention also has some of the same advantages when employed with a direct connection between the viewer and some type of storage media. Specifically, the segmented panoramic video frames are stored on a storage medium (e.g., hard drive, CD, DVD) to which the viewer has direct access. Thus, the viewer can determine which segments of each panoramic video frame are needed to produce the desired view to the user, and reads only these segments from the storage medium. In this way the processing and storage requirements of the viewer are minimized.
In regard to the encoding of the panoramic frame segments, each frame segment is appended with an identifier that identifies what frame and what frame segment “location” (i.e., what region of the panoramic frame) the accompanying image data relates to. A separate file can be created for each video stream corresponding to a certain frame segment region. Alternately, one file could be created for all segment regions, with separate frames and frame segments being identified by the aforementioned identifiers.
The specific features, aspects, and advantages of the present invention will become better understood with regard to the following description, appended claims and accompanying drawings where:
In the following description of the preferred embodiments of the present invention, reference is made to the accompanying drawings, which form a part hereof, and in which are shown by way of illustration specific embodiments in which the invention may be practiced. It is understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present invention.
Exemplary Operating Environment
The invention is operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
With reference to FIG. 1, an exemplary system for implementing the invention includes a general purpose computing device in the form of a computer 110.
Computer 110 typically includes a variety of computer readable media. Computer readable media can be any available media that can be accessed by computer 110 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 110. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
The system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within computer 110, such as during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120.
The computer 110 may also include other removable/non-removable, volatile/nonvolatile computer storage media.
The drives and their associated computer storage media discussed above and illustrated in FIG. 1 provide storage of computer readable instructions, data structures, program modules and other data for the computer 110.
The computer 110 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 180. The remote computer 180 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 110, although only a memory storage device 181 has been illustrated in FIG. 1. The logical connections depicted in FIG. 1 include a local area network (LAN) 171 and a wide area network (WAN) 173, but may also include other networks.
When used in a LAN networking environment, the computer 110 is connected to the LAN 171 through a network interface or adapter 170. When used in a WAN networking environment, the computer 110 typically includes a modem 172 or other means for establishing communications over the WAN 173, such as the Internet. The modem 172, which may be internal or external, may be connected to the system bus 121 via the user input interface 160, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 110, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, FIG. 1 illustrates remote application programs as residing on the memory storage device 181.
The exemplary operating environment having now been discussed, the remaining parts of this description section will be devoted to a description of the program modules embodying the invention.
Frame Segmenting, Encoding and Compression
Each frame of the panoramic video is first segmented into a plurality of regions, and each segmented region is then optionally compressed and encoded separately to form its own video stream.
In regard to the segmenting of each frame of a panoramic video, it is noted that any segmentation pattern can be employed as long as the same pattern is used for each frame.
In regard to the encoding of the panoramic frame segments, each frame segment is appended with an identifier. This identifier at a minimum identifies what frame and what frame segment “location” (i.e., what region of the panoramic frame) the accompanying image data relates to. A separate file can be created for each video stream corresponding to a certain frame segment region. Alternately, one file could be created for all segment regions, with separate frames and frame segments being identified by the aforementioned identifiers.
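One minimal way to realize such an identifier is a fixed-size binary header carried with each segment's image data; the byte layout below is an illustrative assumption, as the patent requires the identifier but fixes no particular format.

```python
import struct

# Hypothetical 12-byte header: frame index, region row, region column,
# and payload length, in network byte order.
HEADER = struct.Struct(">IHHI")

def encode_segment(frame_idx, row, col, payload):
    """Attach the identifier to one segment's (possibly compressed) image
    data so a viewer can tell which frame and region the record carries."""
    return HEADER.pack(frame_idx, row, col, len(payload)) + payload

def decode_segment(record):
    """Recover the identifier fields and the image payload from a record."""
    frame_idx, row, col, n = HEADER.unpack_from(record)
    return frame_idx, row, col, record[HEADER.size:HEADER.size + n]

rec = encode_segment(7, 0, 3, b"jpeg-bytes")
```

Because each record is self-describing, the records can either be written to one file per region or interleaved into a single file, matching both storage layouts described above.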
In regard to compressing the panoramic frame segments, a common way of compressing video for transfer over a network is by using the MPEG4 compression method. In very general terms, MPEG compression works by comparing successive frames in a series of frames looking for items that are similar in these frames. Once the similar items have been identified, the MPEG compression algorithm transmits the similar items only once and subsequently only sends the differences (or non-similar items) identified. While it would not be possible to use this compression method to first compress the frames of the panoramic video and then attempt to segment them, it is possible to segment the frames first and then separately compress each of the series of corresponding segmented regions from successive frames. Thus, compressed video streams can be generated from each of the corresponding segmented regions of the panoramic video frames. In the context of the system and method according to the present invention, if the data files are compressed, then the corresponding data structure will contain compressed data as elements in place of the original uncompressed data.
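The idea of compressing each region's sequence of corresponding segments as its own temporal stream can be sketched with a greatly simplified stand-in for MPEG-style coding: send the first tile whole, then only frame-to-frame differences. The use of zlib and signed 16-bit differences is purely illustrative.

```python
import zlib
import numpy as np

def compress_region_stream(tiles):
    """Compress one region's sequence of corresponding tiles across
    frames: the first tile is stored whole, then only frame-to-frame
    differences (signed 16-bit values, to avoid uint8 wraparound)."""
    chunks = [zlib.compress(tiles[0].tobytes())]
    for prev, cur in zip(tiles, tiles[1:]):
        diff = cur.astype(np.int16) - prev.astype(np.int16)
        chunks.append(zlib.compress(diff.tobytes()))
    return chunks

def decompress_region_stream(chunks, shape):
    """Rebuild the tile sequence by re-applying the stored differences."""
    first = np.frombuffer(zlib.decompress(chunks[0]), dtype=np.uint8)
    frames = [first.reshape(shape)]
    for c in chunks[1:]:
        diff = np.frombuffer(zlib.decompress(c), dtype=np.int16).reshape(shape)
        frames.append((frames[-1].astype(np.int16) + diff).astype(np.uint8))
    return frames

# Three frames of one region, nearly identical from frame to frame.
stream = [np.full((8, 8), v, dtype=np.uint8) for v in (10, 12, 15)]
chunks = compress_region_stream(stream)
```

Note that the differencing only works because segmentation happens first: each region's stream is temporally coherent on its own, so it can be decoded without reference to any other region.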
Transfer to the Viewer and Decoding
Once the panoramic video frames have been segmented, compressed (if desired), and encoded, they are ready for transfer to the viewer. This can be accomplished in a variety of ways, as will now be described.
A standing order system can also be implemented. In such a system, as the user changes viewpoints within the scene being viewed, different segment regions of subsequently transferred frames will be needed to prepare the desired view. In such a standing order system the viewer will request the desired frame segments and these requested segments are sent for each consecutive frame until a new request is sent by the viewer. Once a new request is received the server sends a new set of requested frame segments until a new order is received, and so on, until the entire video has been viewed.
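The standing-order behavior can be sketched as a small server-side generator; the request map and the function's shape are hypothetical, as the patent fixes no API.

```python
def serve_standing_orders(num_frames, requests):
    """Yield (frame, region) pairs as a standing-order server would
    transmit them: the most recent request from the viewer stays in
    force for every subsequent frame until a new request replaces it.
    `requests` maps a frame index to the set of regions ordered there."""
    order = set()
    for frame in range(num_frames):
        order = requests.get(frame, order)
        for region in sorted(order):
            yield frame, region

# The viewer orders regions {1, 2} at frame 0, then {2, 3} at frame 2;
# frames 0-1 carry the first order and frames 2-3 the second.
sent = list(serve_standing_orders(4, {0: {1, 2}, 2: {2, 3}}))
```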
In order to assist the panoramic video viewer in identifying the desired segmented regions of each panoramic video frame that should be transferred in the case of an interactive network connection, an initialization file approach can be employed, as it was in the aforementioned co-pending application entitled “A System and Process for Viewing Panoramic Video”. The viewer described in the co-pending application needs certain information to play panoramic videos, and the initialization file is used to provide it. Essentially, an initialization file associated with a panoramic video is sent by the server to the viewer prior to the viewer playing the video. In one preferred embodiment, this file includes, among other things, pointers or identifiers that indicate how each frame of the panoramic video can be obtained. Thus, in the case of the present invention, the initialization file would be modified to include identifiers that indicate how to obtain each frame segment of every frame of the panoramic video. Further, the initialization file indicates the order in which the frame segments should be played. The identifiers uniquely identify each frame segment of the panoramic video, and the viewer uses them to request the desired frame segments. Specifically, the viewer determines which portion of the scene depicted in a frame of the panoramic video the person viewing the video wishes to see. It then requests only those segments of each panoramic video frame that are needed to provide the desired view to the user, in the frame order indicated in the initialization file.
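A possible shape for such an initialization file is sketched below; the JSON layout, key scheme, and file names are illustrative assumptions, since the patent calls for per-segment identifiers and a playback order but mandates no particular format.

```python
import json

# A "frame/region" key maps to a locator telling the viewer how to
# fetch that one segment; frame_order gives the playback sequence.
init = {
    "frame_order": [0, 1, 2],
    "segments": {
        "0/3": "panovid/frame0000_seg3.bin",
        "0/4": "panovid/frame0000_seg4.bin",
        "1/3": "panovid/frame0001_seg3.bin",
    },
}
text = json.dumps(init)  # sent to the viewer before playback starts

def locator(init_file, frame, region):
    """Look up how to obtain one frame segment of the panoramic video."""
    return init_file["segments"][f"{frame}/{region}"]
```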
The foregoing interactive embodiment has the advantages of preserving the bandwidth utilized when sending data from the server to the viewer since only the data actually used by the viewer is transmitted. In addition, the processing and storage requirements of the viewer are minimized, as only those portions of each frame that are needed have to be decoded, decompressed and stored.
Of course, in some circumstances an interactive approach will not be desired or possible. In such cases, the system and process of the present invention can still be advantageously employed. Although all the segmented regions of each panoramic frame must be sent to the viewer when there is no feedback as to which regions are needed, once received, the viewer can selectively decode and decompress (if necessary) only those segments required to display the portion of the scene currently being viewed by the user.
The panoramic video frame segmentation techniques according to the present invention also have similar advantages when employed with a direct connection between the viewer and some type of storage media (e.g., hard drive, CD, DVD, and the like) where the panoramic video frame segments are stored, since only those portions of the video frame that are needed are read by the viewer, stored and processed. For instance, referring to FIG. 6 and using the viewer described in the aforementioned co-pending application as an example, the viewer first reads the initialization file from the storage medium (process action 602). In this case the pointers or identifiers provided in the initialization file identify where on the storage medium the viewer can obtain each segment of every frame of the panoramic video being viewed. The viewer next determines which portion of the scene captured in the panoramic video the person viewing the video wishes to see (process action 604). It then reads and processes only those segments of the current panoramic video frame that are needed to provide the desired view to the user (process action 606). This process is then repeated for each frame of the panoramic video in the order indicated in the initialization file (process action 608).
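Reading only the needed segments from a single file on a storage medium can be sketched with a one-pass offset index over length-prefixed segment records; the 12-byte record layout here is a hypothetical one, not a format defined by the patent.

```python
import io
import struct

HEADER = struct.Struct(">IHHI")  # frame, row, col, payload length

def index_segment_file(f):
    """One pass over a file of length-prefixed segment records, building
    (frame, row, col) -> byte offset so the viewer can later seek
    directly to just the segments the current view requires."""
    index = {}
    while True:
        pos = f.tell()
        hdr = f.read(HEADER.size)
        if len(hdr) < HEADER.size:
            return index
        frame, row, col, n = HEADER.unpack(hdr)
        index[(frame, row, col)] = pos
        f.seek(n, io.SEEK_CUR)

def read_segment(f, index, key):
    """Read one segment's payload without touching any other record."""
    f.seek(index[key])
    _, _, _, n = HEADER.unpack(f.read(HEADER.size))
    return f.read(n)

# A two-segment "storage medium" held in memory for illustration.
disk = io.BytesIO()
for frame, row, col, data in [(0, 0, 0, b"left"), (0, 0, 1, b"right")]:
    disk.write(HEADER.pack(frame, row, col, len(data)) + data)
disk.seek(0)
index = index_segment_file(disk)
```

In practice an initialization file could carry such offsets directly, letting the viewer skip even the indexing pass.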
While the invention has been described in detail by specific reference to preferred embodiments thereof, it is understood that variations and modifications thereof may be made without departing from the true spirit and scope of the invention. For example, the system and method described above are not limited to just frames of a panoramic video. Rather, they could be employed advantageously in the transfer and viewing of any image having a size that exceeds that which will be viewed. In other words, the system and method according to the present invention could apply to any image where only a portion of the scene or image will be viewed at any one time.
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US6043837 *||May 8, 1997||Mar 28, 2000||Be Here Corporation||Method and apparatus for electronically distributing images from a panoptic camera system|
|US6081551 *||Oct 23, 1996||Jun 27, 2000||Matsushita Electric Industrial Co., Ltd.||Image coding and decoding apparatus and methods thereof|
|US6323858 *||Jun 23, 1999||Nov 27, 2001||Imove Inc.||System for digitally capturing and recording panoramic movies|
|US6337683 *||May 12, 1999||Jan 8, 2002||Imove Inc.||Panoramic movies which simulate movement through multidimensional space|
|US6337708 *||Apr 21, 2000||Jan 8, 2002||Be Here Corporation||Method and apparatus for electronically distributing motion panoramic images|
|US6470378 *||Mar 31, 1999||Oct 22, 2002||Intel Corporation||Dynamic content customization in a client-server environment|
|US6540681 *||Nov 24, 2000||Apr 1, 2003||U-Systems, Inc.||Extended view ultrasound imaging system|
|1||"Panoramic Image Mosaics", Heung-Yeung Shum, Richard Szeliski, IEEE Computer Graphics and Applications, Mar. 1996.|
|2||*||Altunbasak, "A fast method of reconstructing high-resolution panoramic stills from MPEG-compressed video" IEEE Second Workshop on Multimedia Signal Processing, 1998, pp. 99-104, Dec. 1998.*|
|3||Catadioptric Omnidirectional Camera, Shree Nayar, Proc. Of IEEE Conference on Computer Vision and Pattern Recognition, Puerto Rico, Jun. 1997.|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US7127116 *||Dec 16, 2003||Oct 24, 2006||Intel Corporation||Image data compression|
|US8878996||Mar 18, 2010||Nov 4, 2014||Motorola Mobility Llc||Selective decoding of an input stream|
|US20040179593 *||Dec 16, 2003||Sep 16, 2004||Goldstein Judith A.||Image data compression|
|US20060011579 *||Jul 7, 2005||Jan 19, 2006||Kei-Yu Ko||Gas compositions|
|US20130041975 *||Aug 10, 2011||Feb 14, 2013||Nathalie Pham||Distributed media access|
|US20140085303 *||Jul 25, 2013||Mar 27, 2014||Tamaggo Inc.||Splitting of elliptical images|
|U.S. Classification||382/235, 375/E07.129, 375/E07.172, 375/E07.132, 348/E05.002, 375/E07.18, 382/240, 375/E07.027, 375/E07.182|
|International Classification||G06K9/36, H04N7/26|
|Cooperative Classification||H04N19/17, H04N19/46, H04N19/44, H04N19/174, H04N19/162, H04N19/102, H04N21/4325, H04N21/43615, H04N21/4143, H04N21/4728|
|European Classification||H04N21/4728, H04N21/432P, H04N21/4143, H04N21/436H, H04N7/26A10S, H04N7/26D, H04N7/26A8R, H04N7/26A8L, H04N7/26A4, H04N7/26A6U|
|Jul 7, 2000||AS||Assignment|
Owner name: MICROSOFT CORPORATION, WASHINGTON
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:UYTTENDAELE, MATTHEW T.;SZELISKI, RICHARD S.;REEL/FRAME:010982/0386
Effective date: 20000703
|Aug 13, 2008||FPAY||Fee payment|
Year of fee payment: 4
|Sep 8, 2009||CC||Certificate of correction|
|Jul 25, 2012||FPAY||Fee payment|
Year of fee payment: 8
|Dec 9, 2014||AS||Assignment|
Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034541/0001
Effective date: 20141014