US 20060129671 A1
Methods and systems for generic adaptive multimedia content delivery are described. In one embodiment, a novel framework features an abstract content model and an abstract adaptive delivery decision engine. The abstract content model recognizes important aspects of contents while hiding their physical details from other parts of the framework. The decision engine then makes content adaptation plans based on the abstracted model of the contents and needs little knowledge of any physical details of the actual contents. Thus, under the same framework, adaptive delivery of generic contents is possible.
1. A method comprising:
receiving a content request from a content requester;
retrieving the requested content from a content source;
processing the retrieved content to provide an abstract content model comprising a directional graph featuring a top-down hierarchical structure having nodes that represent components of the content and edges that represent relationships between the nodes, the nodes being configured to have a node status that defines dynamic statuses of nodes during content delivery, the node statuses being selected from a group of statuses comprising: (1) inactive status where the node is not yet a deliverable object, (2) activable status wherein an active condition of the node is satisfied but the node is not yet included in a delivery plan, (3) activated status wherein the node has been chosen in a delivery plan, (4) delivered status wherein the node has been delivered successfully to a content receiver, and (5) skipped status wherein the node is not delivered and will not be included in the delivery plan; and wherein there are multiple different types of edges selected from a group of types comprising: (1) a dependency edge type that defines a logical dependency between nodes, (2) a route edge type that defines an ordered or hierarchical dependency between nodes, and (3) a mixed edge type that defines a logical dependency between nodes and an ordered or hierarchical dependency between nodes;
processing the abstract content model to select an optimal delivery plan the use of which will permit requested content to be delivered to the content requester; and
processing the abstract content model to provide deliverable content in accordance with the selected delivery plan.
2. The method of
3. The method of
4. The method of
5. The method of
6. The method of
7. The method of
8. The method of
9. The method of
This is a continuation of and claims priority to U.S. patent application Ser. No. 09/995,499, filed on Nov. 26, 2001, the disclosure of which is incorporated by reference.
This invention relates to methods and systems for adaptive delivery of multimedia contents.
Today we live in a diversified world with a seemingly infinite number of diverse and different sources of on-line information. Typically, this information is accessed via a network such as the Internet. As the Internet and, more generally, computing evolve, people are beginning to become accustomed to and demanding better access to different types of electronically available information.
Against the backdrop of the diverse and different sources of electronic information is the wide range of connected devices being used to access the information. For example, people can now typically access network-accessible information using personal computers, handheld computers, personal digital assistants and the like. The situations encountered by individuals attempting to access this diverse collection of information using an ever growing collection of computing devices differ from user session to user session.
For example, some contents may be authored and suitable for use with a certain type of browser. Yet, the contents may not be suitably acceptable for use in other situations. For example, web contents that are authored for use in connection with a browser installed on a personal computer may not be suitable for display on a small handheld device for reasons not the least of which include the size disparity between the different devices' displays.
One possible solution for this problem is to author the same contents so that they reside in different forms that are suitable for all of the different situations that might be encountered. While this is theoretically possible, the solution is practically infeasible due to the time and expense involved.
One area of promise is in the area of so-called adaptive content delivery. One goal of adaptive content delivery is to have content that is readily or easily adaptable to different computing environments.
Early commercial applications focused on providing faster web page downloads for narrow bandwidth connected users (such as dialup and mobile access). Most of the applications accelerated downloads by simply reducing the sizes of embedded image files using aggressive lossy compression schemes. The cost of this solution is lower quality, which is highly undesirable from a customer service standpoint. Some schemes also supported lossless text compression to reduce the transmission time of web pages.
Some companies such as ProxyNet (based on TranSend technology), SpyGlass, and OnlineAnywhere provide proxies or servers that can adjust web pages to fit the display of smaller devices. Their technologies, however, are based on heuristic rules and customized content filters that are designed for specific websites and are used to extract the most important contents from these web pages. Thus, these solutions tend to be rigid and inflexible.
Accordingly, this invention arose out of concerns associated with providing adaptive systems and methods for efficient and flexible content delivery.
Methods and systems for generic adaptive multimedia content delivery are described. In one embodiment, a novel framework features an abstract content model and an abstract adaptive delivery decision engine. The abstract content model recognizes important aspects of contents while hiding their physical details from other parts of the framework. The decision engine then makes content adaptation plans based on the abstracted model of the contents and needs little knowledge of any physical details of the actual contents. Thus, under the same framework, adaptive delivery of generic contents is possible.
Adaptive content delivery systems and methods are described. Efficiency and flexibility are promoted through a novel solution to generic adaptive multimedia content delivery. Described embodiments are based on an abstract content model that captures important or critical structures and attributes of contents. Contents are modeled as hierarchical directional graphs. Nodes on graphs represent elements of contents. The concept of an “edge” is introduced. Edges define logical relationships between these elements. By finding optimized sub-graphs on these graphs under some constraints, optimized plans for adaptive content delivery can be made. With the help of the abstract content model, optimization procedures for many different types of contents can be standardized. Accordingly different types of contents can be treated equally under this framework.
Exemplary Computer Environment
The various components and functionality described herein can be implemented by various computers.
Generally, various different general purpose or special purpose computing system configurations can be used. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The functionality of the computers is embodied in many cases by computer-executable instructions, such as program modules, that are executed by the computers. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Tasks might also be performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media.
The instructions and/or program modules are stored at different times in the various computer-readable media that are either part of the computer or that can be read by the computer. Programs are typically distributed, for example, on floppy disks, CD-ROMs, DVD, or some form of communication media such as a modulated signal. From there, they are installed or loaded into the secondary memory of a computer. At execution, they are loaded at least partially into the computer's primary electronic memory. The invention described herein includes these and other various types of computer-readable media when such media contain instructions, programs, and/or modules for implementing the steps described below in conjunction with a microprocessor or other data processors. The invention also includes the computer itself when programmed according to the methods and techniques described below.
For purposes of illustration, programs and other executable program components such as the operating system are illustrated herein as discrete blocks, although it is recognized that such programs and components reside at various times in different storage components of the computer, and are executed by the data processor(s) of the computer.
With reference to
Computer 100 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by computer 100 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. “Computer storage media” includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 100. Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
The system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within computer 100, such as during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120. By way of example, and not limitation,
The computer 100 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only,
The drives and their associated computer storage media discussed above and illustrated in
The computer may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 180. The remote computer 180 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to computer 100. The logical connections depicted in
When used in a LAN networking environment, the computer 100 is connected to the LAN 171 through a network interface or adapter 170. When used in a WAN networking environment, the computer 100 typically includes a modem 172 or other means for establishing communications over the WAN 173, such as the Internet. The modem 172, which may be internal or external, may be connected to the system bus 121 via the user input interface 160, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 100, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation,
In the embodiment about to be described, the inventive adaptive content delivery system works as an extended content processor of traditional multimedia content servers. Accordingly, upon receipt of a content request, the server fetches original multimedia contents from a content source and passes them to the adaptive delivery system. Adaptation results are then sent as a response to the request.
As the illustrated and described content host 202 is a front-end between physical contents and other system components, it can be used to manipulate contents based on the abstract content model. In addition, the content host 202 also defines a common set of application program interfaces or APIs for retrieving extended properties of the abstract content model. Exemplary APIs are given at the end of this document. Although the remaining components of the system are content independent, content host 202 itself is dependent on content types. Thus, different media types may desire different implementations of the content host. As a common basis, however, the content host 202 should comprise two sub-modules: content parser 204 and content mapper 206.
In the illustrated and described embodiment, content parser 204 scans input contents and constructs corresponding abstract content model representations either online or offline. Different formats of the same content and capabilities of supported transcoders can also be abstracted into the same model during this process. Specific examples of how this can be done are given below. Characteristics of one suitable technique for implementing the content parser are described in U.S. patent application Ser. No. 09/893,335, entitled “Function-based Object Model for Use in WebSite Adaptation” filed on Jun. 26, 2001, the disclosure of which is incorporated by reference herein.
Content mapper 206 functions in a manner that is opposite of the way content parser 204 functions. That is, content mapper 206 converts abstract content model representations back to physical contents. Real-time-capable content transcoders may also be called at this stage to generate desired results.
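The division of labor between the content parser and the content mapper can be illustrated with a minimal sketch. The class and method names below (`AbstractModel`, `ContentHost`, `parse`, `map_back`, `LineTextHost`) are hypothetical and are not the APIs referred to in this document; the toy host simply treats each line of text as one node.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class AbstractModel:
    """Hypothetical stand-in for the abstract content model."""
    nodes: list = field(default_factory=list)


class ContentHost(ABC):
    """Media-specific front end; other components see only AbstractModel."""

    @abstractmethod
    def parse(self, raw: bytes) -> AbstractModel:
        """Content parser role: physical content -> abstract model."""

    @abstractmethod
    def map_back(self, model: AbstractModel) -> bytes:
        """Content mapper role: abstract model -> physical content."""


class LineTextHost(ContentHost):
    """Toy host (an assumption for illustration): one node per text line."""

    def parse(self, raw: bytes) -> AbstractModel:
        return AbstractModel(nodes=raw.decode("utf-8").split("\n"))

    def map_back(self, model: AbstractModel) -> bytes:
        return "\n".join(model.nodes).encode("utf-8")
```

Under this sketch, supporting a new media type would mean adding another `ContentHost` subclass while leaving the content-independent components untouched.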
Decision engine 208 provides functionality for making content adaptation plans. In the illustrated example, decision engine 208 selects appropriate contents that achieve maximum total QoS (i.e. quality of service) values according to current resource constraints and preference factors (as provided by resource model 210 and preference model 212). Based on the abstract content model, this problem is solved by finding optimized sub-graphs of the abstract content model that maximize QoS values under resource constraints. Details of an exemplary content optimization procedure are covered in the section entitled “Content Optimization” below.
Resource and Preference Models
Input parameters, such as network characteristics and client capabilities, are modeled as resources or preference factors. Resources are used as constraints while the decision engine is looking for the best delivery plans. Preference factors are used to alter QoS factors of the abstract content models. For dynamically changing parameters, such as network characteristics, these models should be able to predict future values since the decision engine may use forward-looking algorithms. More information is provided on this topic in the section entitled “A Simple Sub-optimization Algorithm:” below.
Although a caching stage is not explicitly included
Step 250 receives a content request. This step can be implemented responsive to a client device sending such a request. Step 252 retrieves the requested content from a content source and step 254 parses the content and builds an abstract content model. An exemplary abstract content model is described below in more detail. Step 256 processes the abstract content model to select an optimal delivery plan. Examples of how this can be done are described below. Step 258 then processes the abstract content model to provide deliverable content in accordance with the delivery plan. Step 260 then delivers the content to the content requester.
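The sequence of steps 250 through 260 can be sketched as a simple pipeline. This is only an illustration under stated assumptions: the function name `handle_request` and the five callables passed to it are hypothetical placeholders for the content source, content parser, decision engine, content mapper and network layer.

```python
def handle_request(request, fetch, parse, plan, render, deliver):
    """Run one content request through the adaptive delivery steps.

    All five callables are hypothetical stand-ins supplied by the caller.
    """
    content = fetch(request)                    # step 252: retrieve content
    model = parse(content)                      # step 254: build abstract model
    delivery_plan = plan(model)                 # step 256: select delivery plan
    deliverable = render(model, delivery_plan)  # step 258: produce deliverable
    return deliver(deliverable)                 # step 260: deliver to requester
```

Keeping each stage behind its own callable mirrors the separation between content-dependent parts (fetch, parse, render) and the content-independent planning stage.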
Step 262 receives content. This step can be implemented in any suitable way. For example, this step can be implemented when a server receives content that it is to store for future content requests. Step 264 parses the content and builds an abstract content model. An exemplary abstract content model is described below in more detail. Step 266 processes the abstract content model to select at least one optimal delivery plan. Examples of how this can be done are described below. Step 268 then processes the abstract content model to provide deliverable content in accordance with the delivery plan.
In accordance with the
Exemplary Abstract Content Representation Structure (ACRES)
One of the important goals of the described adaptive content delivery framework is to make the framework a generic content adaptation solution. The decision engine 208 (
In the illustrated and described embodiment, the abstract content model comprises a directional graph that features a top-down hierarchical structure. The hierarchical structure comprises multiple nodes that represent components of the contents, and edges between the nodes represent relationships between these components. In the discussion that follows, definitions of these data models and their basic attributes are given. Then, a discussion of the details of nodes and edges is presented.
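A minimal sketch of such a graph follows, using the five node statuses and three edge types recited in claim 1. The class names, the integer `layer` attribute and the rule that an edge may not point to a higher layer are assumptions made for illustration; they are one possible reading of the top-down structure, not the definition given in this document.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class NodeStatus(Enum):
    INACTIVE = auto()    # not yet a deliverable object
    ACTIVABLE = auto()   # active condition satisfied, not yet in a plan
    ACTIVATED = auto()   # chosen in a delivery plan
    DELIVERED = auto()   # delivered successfully
    SKIPPED = auto()     # will not be delivered


class EdgeType(Enum):
    DEPENDENCY = auto()  # logical dependency
    ROUTE = auto()       # ordered or hierarchical dependency
    MIXED = auto()       # both of the above


@dataclass
class Node:
    name: str
    layer: int           # assumed: 0 is the top of the hierarchy
    status: NodeStatus = NodeStatus.INACTIVE


@dataclass
class ContentGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (src, dst, EdgeType) tuples

    def add_node(self, name: str, layer: int) -> None:
        self.nodes[name] = Node(name, layer)

    def add_edge(self, src: str, dst: str, kind: EdgeType) -> None:
        # Assumed top-down rule: an edge never points to a higher layer.
        if self.nodes[dst].layer < self.nodes[src].layer:
            raise ValueError("edge would point upward in the hierarchy")
        self.edges.append((src, dst, kind))
```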
For the discussion that follows, the reader is referred to
In the illustrated structure 300, a node is an abstract representation of content or content structure. A node is represented using a circle or square in the drawings of the data structure. In
Abstract content representation structure 300 comprises a directional graph G=(N, E) that satisfies the following condition (layered constraint), where N and E stand for “node” and “edge” sets of G:
From this definition one can also see that nodes in N, do not have incoming edges and nodes in Nm do not have outgoing edges. In addition to these definitions, nodes and edges have several basic attributes as listed below.
Node and Edge Attributes
Node status defines the dynamic statuses of nodes during content delivery. In the illustrated and described embodiment the node statuses include the following:
An ignition edge is defined as a dependency edge from a node that is activated, delivered or skipped. In
An active condition defines how the node becomes activable. In the illustrated example, there are three conditions:
An input condition defines when an activable node can be activated. In the present embodiment, except for automatic activable nodes, the only condition is that there exists an incoming route edge from an activated, delivered or skipped node. An output behavior defines how nodes on ends of outgoing route edges can be branched. In the illustrated and described embodiment, a “branch” comprises the operation of changing an activable node to an activated node. In this example, there are three branch operations:
Nodes and Edges
As indicated above, nodes and edges are the basic elements of the described abstract content representation structure. Nodes and edges represent content objects and relationships between these objects. In addition to the basic attributes introduced and discussed above, nodes and edges can have other extended attributes.
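The input condition stated earlier (an activable node can be activated when an incoming route edge originates from an activated, delivered or skipped node) can be sketched as a small predicate. The string labels, the tuple layout of edges and the decision to let a mixed edge count as a route edge are assumptions for illustration; automatic activable nodes are not handled here.

```python
# Statuses of a source node that allow a route edge to enable its target.
ENABLING_STATUSES = {"activated", "delivered", "skipped"}

# Assumption: a mixed edge carries route semantics as well as dependency.
ROUTE_KINDS = {"route", "mixed"}


def can_activate(node, status, edges):
    """Return True if `node` is activable and has an enabling route edge.

    status: dict mapping node name -> status string
    edges:  iterable of (src, dst, kind) tuples
    """
    if status[node] != "activable":
        return False
    return any(
        dst == node and kind in ROUTE_KINDS and status[src] in ENABLING_STATUSES
        for src, dst, kind in edges
    )
```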
As an abstraction of multimedia contents, a node can represent a piece of raw content like a picture, a paragraph of text, a video frame, or a chunk of bits in some coded contents. It can also work as a connection point of content structures where it does not represent any actual contents. In addition to its basic attributes, nodes can have some additional application related attributes.
A value (QoS factor) is defined as an increment to content quality when this node has been delivered. For a node nεN, its value is represented by V(n) or V(n, t) if it is related to time.
A resource factor defines an amount of resources needed to deliver the node. For some types of resources r, one can represent corresponding resource factors of node n by R(n, r).
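As a small worked illustration of these two attributes, the total value of a set of selected nodes is the sum of V(n), and the set is feasible when, for each resource type r, the sum of R(n, r) stays within the available amount. The function names and the dictionary layout below are assumptions, not part of the described APIs.

```python
def total_value(selected, value):
    """Sum of V(n) over the selected nodes."""
    return sum(value[n] for n in selected)


def within_budget(selected, resource, budget):
    """Check that the summed R(n, r) does not exceed the budget for any r.

    resource: dict node -> dict of resource type -> amount needed
    budget:   dict of resource type -> amount available
    """
    return all(
        sum(resource[n].get(r, 0) for n in selected) <= limit
        for r, limit in budget.items()
    )
```

For example, with a 1000-byte budget, a 200-byte text node fits on its own, while adding a 5000-byte image node does not.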
Edges can represent any kind of relationship between node objects. Some possible edge meanings are listed below:
Some Examples of Using Abstract Content Representation Structures
In the section entitled “ACRES and Adaptation Methods in Detail” below and the related figure, an example of how to use an abstract content representation structure to represent MPEG I video bitstreams together with frame-skipping transcoders for bitrate adaptation is provided.
Using the abstract content representation structure, optimized content can be considered as an optimal sub-graph of the corresponding structure. One target of optimization is to maximize preference-altered total QoS factors of abstract content representation structure nodes covered by the sub-graph under the constraints of the resources. In the section immediately below, some suggestions on choosing proper QoS factors are presented. Then a discussion of a simple bounded search algorithm as a near optimal solution to this content optimization problem will be presented. The algorithm is also used in a verification prototype that is introduced in the section entitled “Adaptive MPEG I Video Streaming” below.
Choosing Proper QoS Factors
The QoS factors of abstract content representation structure nodes play an important role in content optimization. This is because the decision engine 208 (
In many cases, the first principle is easier to follow and conforming to the second principle is usually not trivial. It may not be easy to tell which is more important or meaningful as between two different contents. For example, there is a famous saying that “a picture is worth a thousand words”. This might be true in some cases, but not in others. In resource critical applications such as mobile communication, text should be preferred over images most of the time. Thus, it is suggested that QoS definition choices be made on an application specific basis.
A Simple Sub-Optimization Algorithm
In the illustrated and described embodiment, a bounded search algorithm is adopted to find the near optimal solution of the content optimization problem. The pseudo code listed below describes but one optimized adaptive content delivery algorithm. The algorithm is a straightforward implementation of a depth-first search.
The pseudo code starts from a candidate set of activable nodes and then tries to simulate following delivery plans by marking nodes on the path as activated temporarily. A back trace is then used to find other possible delivery plans. Finally, the best starting candidate is selected and delivered. Afterwards, system statuses, such as resources and preferences, are updated. The algorithm is then looped until the end condition is met. It will be appreciated that in some cases, dynamically changing factors, such as network resources, may have to be predicted during the search.
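The following is a simplified sketch of a depth-bounded, depth-first look-ahead in the spirit of the description above; it is not the pseudo code of the described embodiment. It assumes an acyclic graph, a single scalar resource budget, and that exactly one candidate is chosen at each step. The names `best_score`, `plan_delivery`, `successors`, `value` and `cost` are hypothetical.

```python
def best_score(node, successors, value, cost, budget, depth):
    """Best total value of a path starting at `node`, looking `depth` steps ahead.

    Returns None when `node` itself does not fit in the remaining budget.
    """
    if cost[node] > budget:
        return None
    if depth <= 1:
        return value[node]
    best_tail = 0
    for nxt in successors.get(node, []):
        # Simulate choosing `nxt` next, then back-trace to try the others.
        tail = best_score(nxt, successors, value, cost,
                          budget - cost[node], depth - 1)
        if tail is not None and tail > best_tail:
            best_tail = tail
    return value[node] + best_tail


def plan_delivery(candidates, successors, value, cost, budget, depth=3):
    """Repeatedly pick the best starting candidate and commit to it."""
    plan = []
    candidates = list(candidates)
    while candidates:
        scored = [(best_score(n, successors, value, cost, budget, depth), n)
                  for n in candidates]
        feasible = [(s, n) for s, n in scored if s is not None]
        if not feasible:
            break  # end condition: nothing fits the remaining budget
        _, chosen = max(feasible, key=lambda pair: pair[0])
        plan.append(chosen)
        budget -= cost[chosen]  # update system status after delivery
        candidates = list(successors.get(chosen, []))
    return plan
```

With a look-ahead of two steps, the sketch can prefer a cheap first node in order to afford a more valuable later one, which a one-step greedy choice would miss.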
Because there are only limited search depths, the complexity of this algorithm is quite acceptable. However, the algorithm may become complex in some cases. Pruning of search branches is not currently done because node QoS factors may depend on resource consumption of previously selected nodes and thus may change dynamically. Under such a situation, historical records are not reusable and searching cannot be accelerated by pruning. In order to benefit from pruning, modifications can be made. For example, one way to benefit from pruning is to replace continuous values (such as time, QoS, and resource values) with approximate discrete ones.
Adaptive MPEG I Video Streaming Example
As a verification and example of the inventive framework, a simple adaptive MPEG I video streaming application was implemented based on the content optimization algorithm and the above-described abstract content representation structure. One goal of this application is to allow smooth playback of MPEG I video even when transmission bandwidth is less than that which the original video bitstream requires and/or the client-side buffer size is limited.
In the discussion that follows, we start from an analysis of the situation and then we will build an abstract content representation structure of the MPEG I video bit stream that is used in our adaptive delivery verification prototype. After that, a brief introduction will be given to our implementation's architecture. Experimental results are discussed later.
Abstract Content Representation Structures and Adaptation Methods in Detail
Since an adaptable abstract content representation of content is the basis of the inventive approach, this discussion starts by analyzing adaptation schemes of MPEG I video and then constructs the abstract content representation model that enables the adaptation scheme based on the analysis.
As designed, MPEG I video bitstreams do not support scalable delivery. Network bandwidth must be large enough to enable smooth playback in normal cases. Transcoders are required if network bandwidth is less than the original video requires. However, online transcoding of MPEG video bitstreams is computationally intensive and does not suit video streaming applications where multiple contents/connections need to be supported simultaneously. For this reason, a simpler adaptation approach is chosen by selectively replacing/skipping encoded video frames. This approach is very efficient and the video bitrate is reduced at the cost of lower frame rates instead of PSNR losses resulting from normal transcoding.
There are three kinds of encoded video frames in MPEG I video bitstreams: Intra coded (I), forward Prediction coded (P) and Bi-directional prediction coded (B). These frame types provide a trade-off between compression efficiency and playback requirements (such as seek and error recovery). Several encoded frames form a GOP (group of pictures) and temporal references of frames are defined relatively within GOPs. Besides a sequence header that defines essential attributes, video bitstreams are typically a concatenation of GOPs. I frames can be decoded at any time, but decoding of P and B frames depends on decoded reference frames. In other words, there are relationships that exist as inter-frame dependencies and temporal orders. As a result, skipping I and P frames will cause P and B frames that follow not to be decodable. Skipping B frames has no impact on other frames.
Besides these inter-frame dependencies, frame timing is another issue that is addressed during content adaptation. Although video bitstreams with some skipped frames can be decoded without any problems, the frame timing is changed. As a remedy, escape-coded frames are used in place of the P or B frames that would otherwise be skipped. Thus frame timing is kept unchanged during playback. The escape-coded frames are MPEG I coded frames too, and they signal that nothing has changed relative to the reference frame (or one of the reference frames if the frame type is B). According to MPEG I syntax, all macroblocks of a frame must be covered by non-overlapping slices, and a slice must start and end with coded macroblocks. Thus, the minimal escape-coded frame must consist of at least two empty coded macroblocks (top-left and bottom-right) and address skip codes for all other macroblocks between them. As a result, the minimal size of an escape-coded CIF frame (P or B) is 32 bytes, which is minor compared to that of normally coded frames, which are at least several kilobytes.
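The effect of skipping frames on decodability can be sketched as follows. This is a simplified model under stated assumptions: frames are listed in bitstream (coding) order, GOPs are closed, and a P or B frame is treated as decodable only if the chain of I and P frames leading up to it is intact. The function name `decodable_frames` is hypothetical.

```python
def decodable_frames(frame_types, skipped):
    """Return indices of frames that remain decodable after skipping.

    frame_types: sequence of "I", "P" or "B" in bitstream order
    skipped:     set of frame indices that are not delivered
    """
    decodable = []
    chain_ok = False  # True while the current I/P reference chain is intact
    for i, kind in enumerate(frame_types):
        delivered = i not in skipped
        if kind == "I":
            ok = delivered          # I frames need no reference
            chain_ok = ok
        elif kind == "P":
            ok = delivered and chain_ok
            chain_ok = ok           # a lost P breaks the chain for followers
        else:  # "B"
            ok = delivered and chain_ok  # B frames never affect the chain
        if ok:
            decodable.append(i)
    return decodable
```

Skipping a B frame removes only that frame, whereas skipping the first P frame of a GOP leaves only the I frame decodable until the next I frame arrives.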
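The size reduction obtained by substituting escape-coded frames can be estimated with simple arithmetic. The sketch below assumes CIF video, uses the 32-byte minimum stated above, and assumes that every index in `skipped` refers to a P or B frame; the function name and the sample frame sizes are illustrative only.

```python
ESCAPE_FRAME_BYTES = 32  # minimal escape-coded CIF frame (P or B), per the text


def adapted_size(frame_sizes, skipped):
    """Total bytes after replacing each skipped frame with an escape frame.

    frame_sizes: list of coded frame sizes in bytes
    skipped:     set of indices (assumed to be P or B frames) to replace
    """
    return sum(
        ESCAPE_FRAME_BYTES if i in skipped else size
        for i, size in enumerate(frame_sizes)
    )
```

Because the frame count is unchanged, frame timing at the decoder is preserved while the byte count drops by nearly the full size of each replaced frame.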
In accordance with the above discussion,
Each node in this model has attributes including data size, expected time to be decoded (TTD) relative to the starting time of delivery, and QoS factor. The QoS factor is defined dynamically according to the current time and TTD. In this example, this value is assigned based on the following heuristics:
In the described experimental system, some content information is also taken into consideration. From the viewpoint of sampling theory, we can see that frames in fast motion sequences should be preserved with higher precedence than those in slow motion sequences during the frame dropping process. It is also known that those frames containing more motion information will normally use more bits than those frames containing less motion information when they are predictive-coded. As a result, the QoS factor of node n is defined as V(n, t)=U(t)*coded_size(n). However, its effects are very limited when video bitstreams are CBR coded.
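The formula V(n, t)=U(t)*coded_size(n) can be sketched as below. The time-dependent term U is passed in as a function because its exact form is not reproduced here; `step_urgency`, which gives full weight to frames that can still arrive before their TTD and zero weight otherwise, is purely a hypothetical placeholder and not the heuristic of the described system.

```python
def qos_factor(coded_size, ttd, now, urgency):
    """V(n, t) = U(t) * coded_size(n), with U evaluated on the remaining slack.

    coded_size: coded size of the frame in bytes
    ttd:        expected time to be decoded, relative to delivery start
    now:        current time on the same clock
    urgency:    callable mapping slack (ttd - now) to a weight
    """
    return urgency(ttd - now) * coded_size


def step_urgency(slack):
    """Hypothetical U: 1.0 while the frame can still be useful, else 0.0."""
    return 1.0 if slack > 0 else 0.0
```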
In this described example, the prototype was implemented as a WWW service extension to MS Internet Information Server (IIS) running on MS Windows NT. The adaptation application runs as an ISAPI extension on IIS. Video data is processed and streamed in real-time from an original source through a standard HTTP protocol stack provided by IIS. A bandwidth-limitation software pipe was used as a simple emulation of network bandwidth. Adapted video data is first sent through this pipe before IIS sends it out. Parameters such as emulation bandwidth and optimizer search steps are all sent to the server as request parameters. Several popular client applications that support playback of MPEG I video have been successfully tested using HTTP streaming, including Windows Media Player and QuickTime Player.
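The idea of a bandwidth-limitation pipe can be illustrated with a small timing calculation. This sketch is not the prototype's implementation; it assumes a constant emulated bandwidth and computes when each chunk would finish passing through the pipe. The function name `send_schedule` is hypothetical.

```python
def send_schedule(chunk_sizes, bandwidth_bps):
    """Return the time (in seconds) at which each chunk finishes sending.

    chunk_sizes:   chunk sizes in bytes, in sending order
    bandwidth_bps: emulated bandwidth in bits per second (assumed constant)
    """
    finished = []
    elapsed = 0.0
    for size in chunk_sizes:
        elapsed += size * 8 / bandwidth_bps  # bytes -> bits, then seconds
        finished.append(elapsed)
    return finished
```

Comparing these finish times with each frame's expected decode time shows which frames would arrive too late at a given emulated bandwidth.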
We tested the implementation using both CBR and VBR bitstreams.
From these results one can see that bandwidths of the delivered streams are successfully reduced and smoothed. Thus, playback of these video bitstreams is possible even when the network bandwidth is far narrower than the source bitstreams demand and when the client side buffer size is limited. From
Exemplary Application Programming Interfaces
Appearing below are a collection of exemplary application programming interfaces (APIs) that can be utilized to implement embodiments of the system described above.
Methods and systems that provide a framework for generic adaptive multimedia content delivery have been described. The framework features an abstract content model and an abstract adaptation decision engine that can make adaptive delivery plans without knowing much of the physical details of actual content. The capabilities of the framework have been demonstrated with an application of adaptive video streaming. Experimental results further show that the proposed framework is effective and efficient in adaptive delivery of contents under variable network conditions. The described architecture can be easily extended to have much stronger capabilities.
Although the invention has been described in language specific to structural features and/or methodological steps, it is to be understood that the invention defined in the appended claims is not necessarily limited to the specific features or steps described. Rather, the specific features and steps are disclosed as preferred forms of implementing the claimed invention.