US 20060064641 A1
The present invention is concerned with client-side production in a personal computer environment of low bandwidth images and audio. A series of low bandwidth still images along with a “script” and audio data is sent over a network in a client/server architecture or is read from a compact disk or other memory. A “director” module residing in a client personal computer uses the “script” to tell the computer how to execute a sequence of “moves” on the still images. These moves include cuts, dissolves, fades, wipes, focuses, flying planes and digital video effects such as push and pull. Moves within a still image occur in real time, and are relatively smooth and continuous as compared to prior art network video. Low bandwidth is achieved because most of the production is done at the client location without relying upon slow, bandwidth-limited downloading of conventional network video formats.
32. A system for data communications between a server and a client computer that are in communication with each other via a computer network to thereby display a digital video production on the client computer without the digital video production itself being sent over the network from the server to the client computer, the system comprising:
a client computer in communication with a computer network;
a server in communication with the computer network;
wherein the client computer is configured to receive a plurality of image files and a script module from the server over the computer network, the image files defining a plurality of images that are to be displayed as part of a digital video production, the script module defining how the plurality of image files are to be processed to generate the digital video production; and
wherein the client computer is configured to execute a software module that is resident thereon, the software module being configured to generate the digital video production for display on the client computer from the received image files and the received script module.
33. The system of
34. The system of
35. The system of
36. The system of
37. The system of
38. The system of
39. The system of
40. The system of
41. The system of
42. The system of
43. A method comprising:
generating a video sequence locally at a client computer from (1) data received by the client computer from a remote server over a computer network that represents a plurality of still images and (2) data received by the client computer from a remote server over a computer network that represents instructions defining how the still images are to be sequenced together to generate the video sequence; and
displaying the generated video sequence on the client computer.
44. The method of
storing a software module locally on the client computer, the software module being configured to (1) process the received instructions data and (2) generate the video sequence from the received still images data in accordance with the processed instructions data.
45. The method of
receiving, at the client computer, the instructions data and a preliminary portion of the still images data from a remote server over a computer network; and
wherein the generating step comprises generating a preliminary portion of the video sequence from the received instructions data and the received preliminary portion of the still images data; and
wherein the displaying step comprises displaying the preliminary portion of the video sequence once it has been generated.
46. The method of
subsequently receiving a subsequent portion of the still images data from a remote server over a computer network; and
wherein the generating step further comprises generating a subsequent portion of the video sequence from the received instructions data and the received subsequent portion of the still images data; and
wherein the displaying step further comprises displaying the subsequent portion of the video sequence once it has been generated and after the preliminary video sequence portion has been displayed.
47. The method of
48. The method of
receiving data that represents the audio portion of the video sequence from a remote server over a computer network, wherein the instructions data further defines how the received audio data is to be incorporated into the video sequence, and wherein the generating step further comprises incorporating the received audio data into the video sequence in accordance with the processed instructions data.
49. The method of
connecting the client computer to an Internet website;
displaying a page of the Internet website on the client computer; and
initiating the receiving steps, the generating step, and the displaying step in response to selection of a link on the displayed page by a user of the client computer.
50. A system comprising:
a client computer in communication with a computer network;
a server in communication with the computer network;
wherein the server is configured to communicate, over the computer network, a partial video production and a plurality of instructions to the client computer, the instructions defining how a software program resident on the client computer can generate a full video production from the partial video production; and
wherein the client computer is configured to (1) receive the partial video production and the instructions, (2) execute a software program resident thereon to generate the full video production from the received partial video production in accordance with the received instructions, and (3) display the full video production thereon.
This application is a continuation application of U.S. application Ser. No. 10/020,104, filed on Dec. 12, 2001, and entitled LOW BANDWIDTH TELEVISION, now U.S. Pat. No. ______, which is a divisional of U.S. application Ser. No. 09/233,687 filed Jan. 19, 1999, and entitled LOW BANDWIDTH TELEVISION, now U.S. Pat. No. 6,380,950, which claims priority to provisional application Ser. No. 60/071,930 filed Jan. 20, 1998, the entire disclosures of each of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates generally to image processing, and specifically to production of images and audio in a personal computer environment.
2. Discussion of the Prior Art
An important issue in digital technology is providing video images on a personal computer. These images are transmitted across the Internet and other networks, across telephone lines with modem-to-modem connections, or received from compact disk read-only memories (CD-ROMs). The speed of a modem is commonly the limiting factor in sending real time, continuous video information across the Internet, over corporate intranets or local area networks. In comparison, continuous network transmission of audio data does not present significant difficulties.
Table 1 shows theoretical bandwidth maxima for various network architectures. Modem-to-modem connections across lines in plain old telephone service (POTS) have a theoretical bandwidth of 3,360 bytes per second, while connections across the Internet with a modem or single ISDN are limited to 5,600 bytes per second. Dual ISDN network architectures transmit a maximum of 11,200 bytes per second, while corporate local area networks with 10BaseT connections have a capability of transmitting one megabyte per second. With the exception of telephone line connections, these other techniques involve non-continuous, packet-switched data. Satellite and cable architectures are also possible, but have not yet been widely adopted and present other difficulties.
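To make the bandwidth figures above concrete, the following sketch estimates how long a single still image would take to cross each link. The byte rates are taken from the text; the 56,000-byte image size, the dictionary keys and the function name are hypothetical, chosen only for illustration.

```python
# Theoretical maxima quoted in the text, in bytes per second.
BANDWIDTH_BYTES_PER_SECOND = {
    "pots_modem_to_modem": 3_360,
    "internet_modem": 5_600,      # modem or single ISDN across the Internet
    "dual_isdn": 11_200,
    "lan_10baset": 1_000_000,     # "one megabyte per second"
}

def transfer_seconds(size_bytes, link):
    """Best-case time to move size_bytes across the named link."""
    return size_bytes / BANDWIDTH_BYTES_PER_SECOND[link]

# Hypothetical still image size; not a figure from the specification.
STILL_BYTES = 56_000

for link in BANDWIDTH_BYTES_PER_SECOND:
    print(f"{link}: {transfer_seconds(STILL_BYTES, link):.2f} s")
```

These are best-case times; packet-switched links would generally be slower in practice.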
On the other hand, computer memories and processor speeds have made rapid advances. Personal computers have hard drives accommodating many gigabits of data, and the price of memory chips is decreasing. Processor speeds approaching 300 MHz are available, and speeds of several GHz are contemplated.
To view a still or motion picture from the Internet on a personal computer, a user conventionally downloads video data from a web site by clicking on a web link. Often, however, it is necessary to separately download (or otherwise obtain) software, e.g. Adobe Acrobat, in order to display a particular image format. Images are frequently compressed for transmission over networks or storage on disks. Compression algorithms, such as JPEG and MPEG, using discrete cosine transform (DCT) methods, produce serviceable images but compromise image size, image quality, definition, and acquisition speed. Image latency is also sacrificed. A user must wait while an entire image or series of images is buffered in a client side personal computer prior to display. Image transmission is sometimes interrupted due to network errors and traffic. Streaming techniques allow a user to begin viewing the images immediately while downloading, but streaming still sacrifices image quality and latency.
Currently, International Telecommunications Union Standard ITU-R 601 for digital formats in professional video production (i.e. NTSC) requires 720 by 486 pixels per frame in the scanned image, and an eight-bit 4:2:2 sampling of Y, R-Y, B-Y color components at sixty fields (thirty frames) per second. This results in a data stream of 20 megabytes per second if the format is to remain uncompressed and if the images are to be viewed continuously in real time. Clearly, this is greater than the fastest rate for 10BaseT of one megabyte per second. A compression ratio of 5:1 is the most that is considered desirable for production marketplace image quality, but this only reduces the necessary data rate to 4 megabytes per second. Using 4:1:1 sampling, other conventional digital video production techniques (e.g. DVC Pro and DV Cam) produce a marginally improved data rate of 3 megabytes per second. Compression ratios of 30:1 are sometimes used for previewing and editing of video images, but this only yields a data rate of 700 kilobytes per second. Data rates for these formats are summarized in Table 2.
Comparing this to the standard modem of 56 kilobits per second (5,600 bytes per second), there is a readily apparent, significant gap between requirements for ITU-R 601 and present-day hardware transmission capabilities. A further compression ratio of 125:1 on an already-compressed and marginally acceptable 30:1 compressed image, i.e. a total compression of 3,750:1, is needed to transmit ITU-R 601 data across a 56 k modem.
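The data rates discussed above can be checked with a short calculation. The sketch below assumes thirty full frames (sixty interlaced fields) per second, an average of 16 bits per pixel for eight-bit 4:2:2 sampling and 12 bits per pixel for 4:1:1 sampling, and a modem throughput of 5,600 bytes per second; the function and variable names are illustrative only.

```python
WIDTH, HEIGHT = 720, 486        # pixels per frame, from the text
FRAMES_PER_SECOND = 30          # assumption: 60 interlaced fields = 30 full frames
MODEM_BYTES_PER_SECOND = 5_600  # 56 k modem figure used in the text

def raw_rate(bits_per_pixel):
    """Uncompressed data rate in bytes per second."""
    return WIDTH * HEIGHT * bits_per_pixel / 8 * FRAMES_PER_SECOND

rate_422 = raw_rate(16)  # eight-bit 4:2:2 averages 16 bits per pixel
rate_411 = raw_rate(12)  # eight-bit 4:1:1 averages 12 bits per pixel

print(f"4:2:2 uncompressed: {rate_422 / 1e6:.1f} MB/s")
print(f"4:2:2 at 5:1:       {rate_422 / 5 / 1e6:.1f} MB/s")
print(f"4:1:1 at 5:1:       {rate_411 / 5 / 1e6:.1f} MB/s")
print(f"4:2:2 at 30:1:      {rate_422 / 30 / 1e3:.0f} kB/s")

# Further compression needed to fit the 30:1 stream through the modem,
# and the total ratio relative to the uncompressed stream.
further = rate_422 / 30 / MODEM_BYTES_PER_SECOND
total = rate_422 / MODEM_BYTES_PER_SECOND
print(f"further ratio: {further:.0f}:1, total ratio: {total:.0f}:1")
```

Under these assumptions the further ratio comes out at roughly 125:1, and the total, being the product 30 × 125, at roughly 3,750:1.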
Present methods of displaying moving objects on web pages involve either bit-mapped or vector approaches. Simple moving icons on a web page are produced by changing only part of the image in every frame. For example, Microsoft® and Netscape® browsers show moving traces around their logos while a processor is retrieving a page. Advertisements on web pages also display moving images. The bandwidth for these images is reduced by making the images smaller so that fewer bits are needed for each frame, or by slowing down the frame rate so that the images appear to move discontinuously.
High definition television (HDTV) attempts to simplify the display of video images and reduce bandwidth by recognizing constant areas within a video picture and retaining much of the information from a previous frame. While HDTV developed concurrently with MPEG and JPEG, HDTV is broadcast-oriented and does not lend itself to network transmission or personal computer applications.
It is expected that bandwidth will continue to be the bottleneck in network transmission for the foreseeable future. Thus, there is an outstanding need in the prior art to be able to send professional quality video images across networks through ordinary modems by taking advantage of plenary memory and processor capacities within personal computers, and thereby reducing reliance on transmission hardware. There is also a need to create compelling new video experiences in personal computers.
The present invention is concerned with client-side production in a personal computer environment of low bandwidth images and audio. A series of still images in an image module along with a “script” module and an audio module are sent over a network in a client/server architecture or are read from a compact disk or other memory. A “director” module residing in memory (e.g. on hard disk) of the client personal computer uses the “script” to tell the computer how to execute a sequence of “moves” on the still images. These moves include, but are not limited to, cuts, dissolves, fades, wipes, focuses, flying image planes, and digital video effects such as push and pull. The director module is either downloaded from a network on a one-time basis or uploaded from a floppy or compact disk.
Production sequences are in real time, as well as being relatively smooth and continuous as compared to prior art network video. In order to permit viewing as soon as possible and to avoid caching, the script module is transmitted to the personal computer along with preliminary images, so playback begins immediately. Low bandwidth is achieved because a majority of the production is done at the client location and the transmission of still pictures, audio data and script is relatively rapid. Images are always displayed in real time and in full screen formats. If necessary to prevent latency delays, the director module inserts stand-ins from stock footage, animation and loops so that a viewer always has a continuous visual and audio experience.
For a more complete understanding of the invention, as well as other features thereof, reference may be had to the following detailed description of the invention in conjunction with the drawings wherein:
FIGS. 4(a) and (b) illustrate software modules for producing visual and audio sequences; and
FIGS. 2(a) to 2(d) show selected “moves” characteristic of the low bandwidth television of the present invention.
Another application is a moving banner. The banner is stored as a bitmapped still picture 210 (
Low bandwidth television produces a sequence of moves on still bit-mapped images specified by an accompanying script. The production sequence can be rapidly and consecutively strobed and repeated in a particular order, or the sequence can be strobed and repeated in a different order. Repetition and looping of sequences implies that any production sequence has an arbitrarily long and potentially infinite duration. A production sequence may consist of combinations of still images, high resolution photographs, text graphics, high resolution text, and animated computer graphics. While the present embodiment contemplates that the director primarily operates on still images, short video clips residing as stock footage with the director module may optionally be utilized.
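By way of illustration only, the idea of a script driving moves on still images can be sketched as follows. The specification does not disclose a script syntax at this point, so the `Move` record, its field names, and the `render` function are hypothetical; images are modeled as small greyscale pixel grids and only two moves (cut and dissolve) are shown.

```python
from dataclasses import dataclass

@dataclass
class Move:
    kind: str    # "cut" or "dissolve" (hypothetical move names)
    source: int  # index of the outgoing still
    target: int  # index of the incoming still
    frames: int  # number of output frames to generate

def dissolve(a, b, t):
    """Blend two equal-sized greyscale images; t=0 gives a, t=1 gives b."""
    return [[round((1 - t) * pa + t * pb) for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

def render(stills, script):
    """Yield output frames by applying each scripted move to the stills."""
    for move in script:
        a, b = stills[move.source], stills[move.target]
        if move.kind == "cut":
            for _ in range(move.frames):
                yield b
        elif move.kind == "dissolve":
            for i in range(move.frames):
                yield dissolve(a, b, (i + 1) / move.frames)
        else:
            raise ValueError(f"unknown move: {move.kind}")

# Two tiny 2x2 stills and a two-move script: hold the first, then dissolve.
stills = [[[0, 0], [0, 0]], [[200, 100], [50, 0]]]
script = [Move("cut", 0, 0, 2), Move("dissolve", 0, 1, 4)]
frames = list(render(stills, script))
print(len(frames), "frames generated from", len(stills), "stills")
```

The point of the sketch is that six output frames are generated locally from two transmitted stills and a few script entries; a longer dissolve would cost no additional transmitted data.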
Low bandwidth television assumes full screen and real time display of images. In contrast to prior art systems where image size must be scaled and quality reduced to conserve bandwidth, the present invention improves the viewing experience by requiring that images cover the entire screen of the personal computer. Larger size and higher resolution pictures are possible because a majority of the production work is contemporaneously performed by the director at the client location rather than prior to network transmission. Real time display is achieved because the image and audio modules are transmitted quickly across the network due to their small bandwidth. The director further guarantees a real time experience by inserting stock footage, looping and stretching whenever image data is delayed due to network latency.
Each image module is generally synchronized with an audio track that is sent with the script. The audio track optionally includes music tracks, Foley effects, and voiceovers. An audio engine has a capability of mixing multiple audio tracks and adding special audio effects such as reverb and audio delays in real-time. The director module includes a high quality audio synthesizer having a file size of about 20 megabytes.
One major difference between the low bandwidth production system of the present invention and prior art video production systems is the degree to which a finished product is sent over a network or stored on a disk. Prior art Internet video devices (e.g. MPEG) send a finished product over the network, while the present invention sends only a partial product and a script and then finishes the video production at the client station with the director. Much greater bandwidth is required by the prior systems to send a finished series of images over the network than is required to send a partially completed set of images with a script describing how the images are to be animated, and then to finish the animation of the images at the client computer. In a disk storage environment, much more disk space is required to store all of the pixels of a series of images than to store one image and script code representing how the images are to be animated.
Furthermore, the video production method of the present invention is much faster than prior art methods despite the reassembly time for still image production at the client. The speed of the prior art method of downloading video images from the Internet is limited by a bottleneck at the modem. By contrast, while the video production of the present invention is uncompleted at the time it arrives at the client computer, the processor reconstructs the production from the images and the script much more quickly than the delay occasioned at the modem.
Low bandwidth television (LBTV) has a number of advantageous characteristics. It uses the same audio and visual language as film and video production standards. Smooth and continuous motion is produced in real-time as compared to standard methods of viewing images from networks. There is no image latency because the image stills and script are transmitted rapidly in comparatively small files. Moving images are displayed in real-time because the director quickly calculates the production sequence at the client computer from the stills and script. The images are displayed at sixty fields per second (in NTSC) with anti-aliased graphics, high-resolution imagery, full-screen displays and high-quality audio. These capabilities are realized because the majority of the work is done by exploiting the processor and memory at the client computer.
Although digitized video clips may be used with LBTV, their large bandwidth implies that they are utilized sparingly. However, clip bandwidth can be decreased with keys to reduce their size, or with other special effects such as strobing or posterization. In strobing, every fifth video frame is displayed and frozen. Stock footage stored at the client computer may also be used since it requires no network transmission time.
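The strobing effect described above, in which every fifth frame is displayed and frozen, can be sketched as below. The function name and the representation of frames as list elements are assumptions made for illustration.

```python
def strobe(frames, hold=5):
    """Show every `hold`-th frame and freeze it until the next sampled frame."""
    return [frames[(i // hold) * hold] for i in range(len(frames))]

# A 12-frame clip, with frames represented by their index numbers.
clip = list(range(12))
print(strobe(clip))  # [0, 0, 0, 0, 0, 5, 5, 5, 5, 5, 10, 10]
```

Only the sampled frames (`clip[::5]` here) would need to be transmitted, which is where the reduction in clip bandwidth would come from.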
The present invention also provides stand-ins and loops to permit an immediate and continuous viewing experience without caching. Initially, only the script module and the earliest part of the image data from the image module are sent across the network. Thus, presentation of images and sound begins immediately for the viewer without downloading of the entire image file. Neither is it necessary for the image and sound data to cache in the client computer memory. To prevent latency problems, the director inserts stock footage as stand-ins or causes the images already received to loop or stretch in the production sequence. Therefore, in contrast to prior art systems where the visual stream is interrupted or the viewer must wait while the images are downloaded, the director ensures a continuous viewing experience.
The present invention is also applicable to receiving a production module comprising a script module, an image module, and an audio module, from a disk drive, e.g. a CD-ROM, rather than obtaining this module over a network. While digital video disks (DVDs) provide for real time viewing at approximately sixty frames per second, low bandwidth production techniques further increase the number and run time of programs that can be stored on a single DVD. Moreover, LBTV does this without data compression.
A particular video production begins in step 320. The production module includes an image module, an audio module and a script module. Initially, only the script module and first viewing parts of the image and audio modules are transmitted over the network so that viewing begins immediately without caching. Viewing is initiated either by clicking on a link in a web site and receiving transmitted data from a network (e.g. the Internet) via a server, or by reading from a disk drive, for example, a magnetic disk or a CD-ROM.
The director module uses the script module to generate initial video and audio sequences from the image module and the audio module (step 330). The video and audio sequences are played on the video screen and through stereo speakers of the personal computer (step 340). Meanwhile, more data from the image module and audio module are loaded across the network into the client computer (step 370). The director module continues to work on the newly received data from the image and audio modules with cues from the script module to generate new visual and audio sequences.
If there is a gap at any time in the production due to latency or data transfer problems (step 345), the director maintains a continuous real time presentation by inserting stock footage or providing looping (step 360). As long as there is more data being received from the network (step 365), the director continues to load data from the image and audio modules (step 370). When program data transmission is complete, a user may return to play another video and audio sequence (step 380), or terminate the program (step 390).
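The playback flow of steps 330 through 370 can be illustrated with a small simulation. This is a sketch under stated assumptions, not the disclosed implementation: time is modeled as discrete ticks, each scripted segment plays for one tick, and the segment names, the `arrivals` table and the `play` function are hypothetical.

```python
from collections import deque

def play(script_segments, arrivals, stand_in="stock-loop"):
    """Simulate the director's playback loop.

    script_segments: ordered names of the segments the script calls for.
    arrivals: maps each segment name to the tick at which its data arrives.
    Returns the item shown at each tick.
    """
    pending = deque(script_segments)
    shown = []
    tick = 0
    while pending:
        segment = pending[0]
        if arrivals[segment] <= tick:
            # Data is available: generate and play it (steps 330 and 340).
            shown.append(segment)
            pending.popleft()
        else:
            # Gap detected (step 345): fill it with a stand-in (step 360).
            shown.append(stand_in)
        tick += 1
    return shown

# "scene2" is delayed by the network, so two stand-in ticks are inserted.
timeline = play(["intro", "scene1", "scene2"],
                {"intro": 0, "scene1": 1, "scene2": 4})
print(timeline)
```

In this run the viewer sees `intro`, `scene1`, two stand-in ticks, and then `scene2`, so the presentation never stalls while the delayed data is loading.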
FIGS. 4(a) and (b) illustrate software components of the various modules of the present invention. The plug-in comprises director module 410, which includes full screen transition algorithms 420 and partial screen effects algorithms 430 (
Production module 470 includes script module 475 with commands in an edit decision list (EDL), image module 480 having bit-mapped images of the still pictures utilized in the production, including photographs 481, graphic images 482, and short video clips 483 (