US 7176957 B2
A multi-participant videoconference system incorporating a back-channel connection and a client video mixer is disclosed. The multi-participant videoconference system includes a client component and a server component. The server component provides a composite conference video signal to the client component. A region is defined in the composite conference video signal and the size and coordinates of the region are communicated to the client component by the server component over the back-channel. The client component captures local video and mixes local video into the composite conference video signal using the size and coordinates received from the server component for display.
1. A multi-participant videoconference system, comprising:
a client component including a conference client that is enabled to execute videoconferencing software;
a conference channel;
a server component enabled to provide a composite video signal to the client component over the conference channel, the composite video signal containing a video signal received from at least one other client component in the multi-participant video conference system; and
wherein the client component further comprises a client video mixer configured to mix a local video signal into the composite video signal received from the server component and render video images including a local video image contained in the composite video signal according to position and layout information communicated between the server component and the client component over the back-channel.
2. The multi-participant videoconference system of
3. The multi-participant videoconference system of
4. The multi-participant videoconference system of
5. The multi-participant videoconference system of
6. A method for defining a region for local video within a conference video layout on a display of a client in a multi-participant videoconference system, which also includes a server, the method comprising the steps of:
defining a conference video layout for the client's display, the conference video layout having a plurality of regions, at least one of which is for externally captured video carried in a composite conference video signal received from the server;
identifying a region in the defined conference video layout for display of local video that is captured by the client and processed only by the client;
communicating a size and a location of the identified region for display of the local video to the client;
transmitting the composite conference video signal to the client, the composite conference video signal including the identified region for display of the local video; and
mixing a local video signal containing the local video into the composite conference video signal in accordance with the identified region.
7. The method of
communicating a size and a location of any of the at least one region for externally captured video that at least partially obscures the identified region for display of the local video to the client.
8. The method of
9. The method of
10. The method of
11. The method of
12. The method of
13. The method of
14. A client component for use in a multi-participant videoconference system, the client component comprising:
a video capture device configured to capture local video and generate a local video signal;
a conference channel port through which the client component is configured to communicate primary conference information with an external component, the primary conference information including a composite video signal transmitted to the client component from the external component;
a back-channel port through which the client component is configured to communicate secondary conference information with the external component, the secondary conference information including instructions received by the client component from the external component regarding placement of the local video relative to the composite video on a display associated with the client component; and
a mixer configured to mix the local video signal into the composite video according to the received instructions.
15. The client component of
16. The client component of
17. The client component of
18. The client component of
19. A method for improving image quality of a composite conference video in a multi-participant videoconference system, comprising:
defining a layout for a plurality of video images;
identifying a region of the layout to be replaced by a locally generated video image;
composing a composite conference video signal configured to communicate and display the composite conference video in the defined layout, the composite conference video signal including the identified region;
transmitting the composite conference video signal from a server component to a client component of the multi-participant videoconference system; and
displaying the composite conference video with the identified region replaced by the locally generated video image.
20. The method of
21. The method of
1. Field of the Invention
The present invention relates generally to multi-participant conferencing systems, and more particularly to utilization of local video in an integrated multi-participant conferencing system having a back-channel connection.
2. Description of the Related Art
Conferencing systems generally define the ways in which a set of participants may collaborate. The structures of conferencing systems typically establish the rules for information exchange. Methods of communication are identified and defined, and accessible media is identified within the conferencing system structure. Some systems may allow for communication with remote or mobile participants, while others limit access to dedicated locations or members. Features such as grouping participants into logical meeting spaces, assigning security access rights to a collaboration, specializing in particular types of media communication, and so forth, are all elements of various conferencing systems.
Participants in a videoconference typically receive media from several contributing sources in a collaborative exchange. Such media includes, but is not limited to, video, audio, and slide shows, and can be offered by the individual participants in the videoconference, by the conferencing system itself, or by both. Looking at just the video portion of the media, each participant, whether or not that participant contributes media to the conferencing system, typically receives a single video signal representing a view of the conference.
Depending on the conferencing system used, the conference view displayed to each participant may be divided into regions according to a video layout or presentation chosen by the individual participant or by the conferencing system. Each region displayed may represent participants in the conference, or a single region representing a primary speaker may be selected or defined, or any number of variations of video presentation according to the level of sophistication or complexity of the particular conferencing system and according to preferences of conference participants.
It is often advantageous for a contributing participant in a videoconference to have a view or display of the video that the contributing participant is providing to the collaboration. Such a view or display provides feedback to the contributing participant, allowing the contributing participant to ensure that what is being provided is accurate, desirable, and is conveying the message or view intended (e.g., the subject is in frame and has correct focus and exposure). In the typical conferencing system, the contributing participant's own video is simply another media in the conference. Consequently, a region in a video presentation or layout may be identified to contain the view of the contributing participant's own video. The result is that the contributing participant sees him or herself in one of the defined regions of the video presentation or layout.
Typically, all video in the videoconference, including the “self video” described above, is sent from a participant using a conference client of the conferencing system to a media mixer of the conferencing system. The conferencing system then returns a video signal according to the participants' selections, including, if selected, the contributing participant's self video. As with many communication systems, the process as described introduces latency through transmission delays and processing time. Additionally, video quality can be diminished as the video signal may be operated on by encoders, decoders, mixers, processors, etc., and is affected by transmission limitations including bandwidth or signal loss. The resulting self video, therefore, is typically degraded, distracting, and generally undesirable.
Some prior art approaches to achieving acceptable self or local video include providing local video through a separate loopback technique. In this approach, video from a local camera device is both forwarded to the conferencing system for mixing with all other video and displayed directly on the conference client to the contributing participant. Common techniques for the local presentation include display of a separate video window, display in the video presentation or layout as a Picture-In-Picture over or within the system video presentation or layout, and the use of a separate local video display. While such techniques benefit the contributing participant by enabling essentially real-time video display, drawbacks include a possible need for more display area, increased window management, additional equipment, and the need to monitor more than one display.
In view of the foregoing, what is needed is a videoconferencing system that implements a local video loopback method providing useful video presentation or layout and region interfaces, enhancing the videoconference environment, and that is easily implemented.
Broadly speaking, the present invention fills these needs by providing a multi-participant videoconferencing system having a back-channel communication link and a client video mixer to integrate local video into a server-provided composite conference video feed. The present invention can be implemented in numerous ways, including as a process, an apparatus, a system, a device, a method, or a computer readable medium. Several embodiments of the present invention are described below.
In one embodiment, a multi-participant videoconference system is provided. The multi-participant videoconference system includes a client component. The client component includes a conference client enabled to execute peer-to-peer videoconferencing software. The conference client communicates video and audio data across a conference channel. The client component further includes a client monitor configured to monitor the conference client, and a back-channel connection. The back-channel connection is a parallel communication link to the conference channel between the client component and a server component. The client component also includes a client video mixer. The multi-participant videoconference system further includes a server component. The server component provides a client configurable audio/video stream to each of a plurality of participants in the multi-participant video conference system.
In another embodiment, a method for defining a region for local video within a composite video layout of a multi-participant videoconference system is provided. The method includes defining a composite conference video layout. The composite conference video layout has up to a plurality of regions. The method also includes identifying a region in the defined composite conference video layout for a local video display. The method then provides for communicating a size and a location of the identified region to a client component of the multi-participant videoconference system, and for transmitting a composite conference video signal to the client component. The composite conference video signal includes the identified region for the local video display.
In a further embodiment, a multi-participant videoconference system is provided. The multi-participant videoconference system includes a server component having a media mixer, and a client component having a client video mixer. The client video mixer is capable of inserting real-time video content into a conference composite video signal within a specified region defined by the server component.
In yet another embodiment, a method for improving an image quality of a composite conference video in a multi-participant videoconference system is provided. The method includes defining a layout for up to a plurality of video images. The layout defines a composite of the up to a plurality of video images. A region of the layout is identified. The identified region is to be replaced by another video signal. The method further provides for composing a composite conference video signal. The composite conference video signal is configured to communicate the composite conference video in the defined layout. The composite conference video signal includes the identified region. The method then includes transmitting the composite conference video signal from a server component to a client component of the multi-participant videoconference system. The identified region of the composite conference video minimizes processing of one of the up to a plurality of video images in the composite conference video signal.
The advantages of the present invention over the prior art are numerous and will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.
The accompanying drawings, which are incorporated in and constitute part of this specification, illustrate exemplary embodiments of the invention and together with the description serve to explain the principles of the invention.
An invention for a local video loopback method in a multi-participant videoconferencing system is described. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be understood, however, by one skilled in the art, that the present invention may be practiced without some or all of these specific details. In other instances, well known process operations have not been described in detail in order not to unnecessarily obscure the present invention.
Embodiments of the present invention provide a method and system to mix local video, i.e., video captured by a client device, with composite conference video, i.e., video from a videoconference server presenting a mix of participant video, according to position and layout information provided by either the videoconference client or the videoconference server. As used herein, media includes any suitable type of information or data encountered within a videoconference environment, e.g., video/audio streams, raster/vector images, documents, annotations, POWERPOINT presentation images, etc. Embodiments of the present invention may be implemented in a multi-participant videoconference system. An exemplary multi-participant videoconference system is generally described below, and is further described in detail in U.S. patent application Ser. No. 10/192,080 filed on Jul. 10, 2002, and entitled “Multi-Participant Conference System With Controllable Content Delivery Using a Client Monitor Back-Channel” which is hereby incorporated by reference for all purposes.
Multi-Participant Videoconferencing System
As an overview, a general description of a multi-participant videoconferencing system in which embodiments of the present invention can be implemented is provided. The following description is intended to be exemplary of a multi-participant videoconferencing system and environment, and should not be construed as limiting or exclusive. Embodiments of the present invention can be implemented in a plurality of videoconferencing systems, and may serve to enhance or improve the videoconferencing experience with a system less capable or simply having different features than the exemplary system described.
An exemplary multi-participant videoconferencing system environment includes a server-side multi-point control unit (MCU) to enable multi-participant features while connecting clients having pre-existing peer-to-peer videoconferencing software. The multi-participant videoconferencing system includes a parallel connection to the conference channel to enable functionality through a client monitor that monitors a participant's interactions with the videoconferencing system.
The exemplary multi-participant videoconferencing system includes a client component and a server component. The client component includes a conference client and a client monitor. Generally, the conference client is a peer-to-peer videoconferencing application. An example of a peer-to-peer videoconferencing application is MICROSOFT'S™ NETMEETING® application, but many peer-to-peer videoconferencing applications exist and may be suitable according to a particular videoconferencing system. The client monitor captures input from the conference client. In addition, the client monitor may incorporate a graphical user interface (GUI) in which the video window of the peer-to-peer application is a component.
The client monitor provides the captured input from the conference client to the server component. The captured input is transmitted to the server component through a separate connection, i.e., a back-channel connection, that operates in parallel with the conference channel for each conference client. A back-channel connection system enables the server to dynamically modify the GUI being presented to a participant based on the captured input provided to the server component. For example, the client monitor can capture events, such as mouse clicks or other mouse and/or keyboard activity, executed by a user when the mouse pointer is within a region of the conference client that displays the video signal. The events are transmitted through the back-channel connection to the server component for interpretation. In this manner, the back-channel connection allows for active regions and user interface objects within the video stream to be used to effect functionality and content.
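As a rough illustration of the event capture described above, the following Python sketch converts a mouse click given in screen coordinates into an event expressed relative to the video window, and ignores clicks that fall outside that window. The function name `to_video_event`, the `(left, top, width, height)` window tuple, and the dictionary message format are all assumptions made for this example; the description does not specify a back-channel message format.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation of the video window: (left, top, width, height)
# in screen pixels. This layout is an assumption made for illustration.
Window = Tuple[int, int, int, int]


def to_video_event(click_x: int, click_y: int,
                   window: Window) -> Optional[Dict[str, object]]:
    """Return a back-channel event for a click inside the video window.

    Coordinates in the returned event are relative to the window's top-left
    corner. Clicks outside the window produce no event (None).
    """
    left, top, width, height = window
    inside = (left <= click_x < left + width and
              top <= click_y < top + height)
    if not inside:
        return None
    # Hypothetical message shape; a real system would define its own protocol.
    return {"type": "mouse_click", "x": click_x - left, "y": click_y - top}
```

In such a sketch, the returned dictionary would be serialized and sent over the back-channel connection, leaving interpretation of the coordinates to the server component.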
Client monitor (CM) 114 monitors conference client 112. CM 114 a is configured to monitor conference client A 112 a. That is, CM 114 a looks at how a participant 110 is interacting with the software application by monitoring, for example, a video display window of conference client A 112 a. In addition, CM 114 a interprets the participant's 110 interactions in order to transmit the interactions to the server component. In one embodiment, CM 114 is configured to provide four functions. One function monitors the start/stop of a conference channel so that a back-channel communication session can be established in parallel to a conference channel session between the participant and the server component. A second function monitors events, such as participant 110 interactions and mouse activity, within the video window displayed by conference client 112. A third function handles control message information between the CM 114 and a back-channel controller 126 located in the server component. A fourth function provides an external user-interface for the participant 110 that can be used to display and send images to other conference participants 110, show the other connected participant names, and other communication information or tools.
As mentioned above, client monitor 114 monitors activity in conference client 112. For example, this may include monitoring participant events over the video display region containing the conference content, and also may include the conference session control information. For example, CM 114 monitors the start and end of a conference session or a call from the conference client. When conference client 112 places a call to MCU 120 to start a new conference session, CM 114 also places a call to the MCU 120. The call from CM 114 establishes back-channel connection 118 for the participant's conference session. Since CM 114 can monitor the session start, stop, and other events, back-channel connection 118 initiates automatically without additional user setup, i.e., the back-channel connection 118 is transparent to a user. Accordingly, a new session is maintained in parallel with conference client 112 activity. It should be appreciated that conference channel 116 provides a video/audio connection between conference client 112 and connection manager 122 of MCU 120. In one embodiment, conference channel 116 provides a communication link for real time video/audio data of the conference session communicated between the client component and the server component.
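The automatic setup of the back-channel can be sketched as a small state machine that reacts to session events observed on the conference client. This is a minimal sketch under stated assumptions: the class name `ClientMonitorSketch`, the event strings `"session_started"` and `"session_ended"`, and the two callbacks are hypothetical and stand in for whatever call-control mechanism an actual client monitor would use.

```python
from typing import Callable


class ClientMonitorSketch:
    """Opens and closes a back-channel in step with the conference session."""

    def __init__(self, place_call: Callable[[], None],
                 hang_up: Callable[[], None]) -> None:
        # Hypothetical callbacks: place_call opens the back-channel to the
        # MCU, hang_up closes it.
        self._place_call = place_call
        self._hang_up = hang_up
        self.back_channel_open = False

    def on_conference_event(self, event: str) -> None:
        """React to a session event observed on the conference client."""
        if event == "session_started" and not self.back_channel_open:
            self._place_call()
            self.back_channel_open = True
        elif event == "session_ended" and self.back_channel_open:
            self._hang_up()
            self.back_channel_open = False
        # Any other event, or a repeated start/stop, is ignored here.
```

Because the monitor reacts to the conference client's own session events, the participant never has to open the back-channel explicitly, which is the transparency the paragraph above describes.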
In one embodiment, CM 114 specifically monitors activity that occurs over the conference's video frame displayed by conference client 112. For example, CM 114 may monitor the video image in MICROSOFT'S™ NETMEETING® application. Mouse activity in the client frame is relayed via protocol across back-channel connection 118 to MCU 120. In turn, back-channel controller 126 can report this activity to another participant, or event handler 124 for the respective participant. In this embodiment, the monitoring of conference client 112 application occurs through a hook between the operating system level and the application level. As described below, the video window can be monitored for mouse clicks or keyboard strokes from outside of the videoconferencing application.
In another embodiment, CM 114 can present a separate user-interface to the participant. This interface can be shown in parallel to the user interface presented by conference client 112 and may remain displayed throughout the established conference. Alternatively, the user interface presented by CM 114 may be presented before or after a conference session for other configuration or setup purposes.
In yet another embodiment, CM 114 may provide an interface for direct connection to a communication session hosted by MCU 120 without need for a conference client. In this embodiment, CM 114 presents a user interface that allows back-channel connection 118 to be utilized to return meeting summary content, current meeting status, participant information, shared data content, or even live conference audio. This might occur, for instance, if a participant has chosen not to use conference client 112 because the participant only wishes to monitor the activities of the communication.
CM 114 a is configured to recognize when the videoconference application of conference client A 112 a starts and stops running; in turn, CM 114 a can start and stop running as the conference client does. CM 114 a can also receive information from MCU 120 in parallel with the videoconference session. For example, CM 114 a may allow participant A 110 a to share an image during the conference session. Accordingly, the shared image may be provided to each of the client monitors so that each participant is enabled to view the image over a document viewer rather than through the video display region of the videoconference software. As a result, the participants can view a much clearer image of the shared document. In one embodiment, a document shared in a conference is available for viewing by each of the clients.
The server component includes MCU 120 which is configured to deliver participant customizable information. It should be appreciated that MCU 120 and the components thereof are software code configured to execute functionality as described herein. In one embodiment, MCU 120 is a component of a hardware based server implementing the various features described herein and elsewhere. MCU 120 includes media mixer 130, back-channel controller 126, and event handler 124. MCU 120 also provides connection manager 122 and session manager 128.
In one embodiment, MCU 120 functionality is enabled by providing for connections of separate participants into selectable logical rooms for shared conference communications. MCU 120 acts as a “peer” to a conference client, but can also receive calls from multiple participants. One skilled in the art will appreciate that MCU 120 internally links all the participants of the same logical room, defining a multi-participant conference session for each room. Each peer-to-peer conference client 112 operates with the MCU 120 only as a peer. Acting as a peer endpoint for each of participants 110, connection manager 122 is where all media enters and exits the MCU 120 for a given participant 110. Participants 110 equipped with a back-channel connection 118 connect with connection manager 122 for resolution of events, through the event handler 124. In one embodiment, MCU 120 is configured to conform to the peer requirements of conference client 112. For example, if the conference clients 112 are using H.323 compliant conference protocols, as found in applications like MICROSOFT'S™ NETMEETING®, MCU 120 must also support the H.323 protocol. The conference communication, in various embodiments of the exemplary multi-participant videoconference system 100, can occur via H.323 protocols, Session Initiation Protocol (SIP), or other suitable APIs that match the participant connection requirements. Conference communication by any protocols or APIs as described above is usually over conference channel 116.
Event handler 124 monitors each participant's 110 activity and provides input to the media mixer 130 to configure a media layout or presentation. Session Manager 128 defines the rules that govern each type of conference collaboration and controls both system and participant media exchange behaviors accordingly. Session manager 128 can limit the content available to participants 110 for their manipulation or control. Session manager 128 can also define the roles of a set of one or more participants 110 and offer functions appropriate to their roles. By way of example, session manager 128 may define presentation rules that favor control of a conference by a speaker participant 110 over audience participants 110. When an audience participant 110 has a question, the rules may dictate that the speaker participant 110 must signal the system to call upon the audience participant 110 with the question, allow the audio of the audience participant 110 to pass through the system, and then return control to the speaker participant 110 for comments. In defining panel discussion rules, session manager 128 may define a small set of participants 110 to constitute the “primary” participants 110, while other participants 110 attend in an essentially observation mode only. Session manager 128 functions include controlling content and activity based upon the collaboration model.
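The speaker-and-audience example above can be sketched as a tiny rule object that decides whose audio is allowed through the system. This is an illustrative sketch only: the class `PresentationRules` and its method names are hypothetical, and the description does not say how the session manager represents its rules internally.

```python
from typing import Optional


class PresentationRules:
    """Presentation rules that favor the speaker over audience participants."""

    def __init__(self, speaker: str) -> None:
        self.speaker = speaker
        # The audience participant currently called upon, if any.
        self.called_on: Optional[str] = None

    def call_on(self, requester: str, audience_member: str) -> bool:
        """Only the speaker may call upon an audience participant."""
        if requester != self.speaker:
            return False
        self.called_on = audience_member
        return True

    def return_control(self, requester: str) -> bool:
        """Only the speaker may take control back after a question."""
        if requester != self.speaker:
            return False
        self.called_on = None
        return True

    def audio_passes(self, participant: str) -> bool:
        """Audio passes for the speaker and for whoever was called upon."""
        return participant == self.speaker or participant == self.called_on
```

A panel-discussion rule set could follow the same pattern, with a set of primary participants in place of the single speaker.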
Media mixer 130 is configured to assemble audio and video information specific to each participant 110 from the combination of all participants' 110 audio and video, the specific participant 110 configuration information, and server user-interface settings. Media mixer 130 performs multiplexing work by combining incoming data streams, i.e., audio/video streams, on a per participant 110 basis. Media mixer 130 includes a video layout processor and an audio distribution processor to assemble the conference signals. Media mixer 130 receives instruction from a number of sources including event handler 124, and session manager 128, to control the layout and content of media delivery for each participant 110.
The client monitor-back-channel network allows MCU 120 to monitor a participant's interactions with conference client 112 and to provide the appearance that the peer-to-peer software application has additional functionality. The additional functionality adapts the peer-to-peer functionality of the software application, executed by conference client 112, for the multi-participant environment described herein. The client monitor-back-channel network includes client monitor 114, back-channel connection 118, back-channel controller 126, and event handler 124.
Back-channel connection 118 is analogous to a parallel conference in addition to conference channel 116. Back-channel controller (BCC) 126 maintains the communication link from each client monitor. Protocols defined on the link are interpreted at MCU 120 and passed to the appropriate destinations, i.e., BCC 126 for other participants, event handler 124, or back to the CM 114.
In one embodiment, MCU 120 provides a client configurable video stream containing a scaled version of each of the conference participants. A participant's event handler 124 in MCU 120 is responsible for maintaining state information for each participant 110 and passing this information to media mixer 130 for construction of that participant's view of the conference provided over conference channel 116.
Local Video Loopback
In the multi-participant video conference system 200 shown in
Participants of the defined multi-participant videoconference system 200 receive a single video signal representing a view of the videoconference. Each participant's 210 view may be divided into regions in a video layout chosen by the participant 210 or by the conference system 200. Each region in the video layout is configured to present a media in the conference. For example, in a conference of five participants, a four region video layout may be selected by one of the participants. Each region is configured to show the video contributed to the collaboration by each of the other four participants. A second participant chooses a video layout with a single region and configures it to show the video of the primary speaker participant.
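To make the notion of a video layout concrete, the following sketch divides a frame into equally sized regions, one per media source. It is one possible layout policy chosen for illustration, not the layout algorithm of the described system; the function name `grid_layout` and the `(x, y, width, height)` tuple form are assumptions.

```python
import math
from typing import Dict, List, Tuple

# (x, y, width, height) of a region within the conference view.
Rect = Tuple[int, int, int, int]


def grid_layout(frame_w: int, frame_h: int,
                sources: List[str]) -> Dict[str, Rect]:
    """Assign each source a cell in a near-square grid covering the frame."""
    if not sources:
        return {}
    cols = math.ceil(math.sqrt(len(sources)))
    rows = math.ceil(len(sources) / cols)
    cell_w = frame_w // cols
    cell_h = frame_h // rows
    layout: Dict[str, Rect] = {}
    for index, source in enumerate(sources):
        col = index % cols   # position within the row
        row = index // cols  # which row the cell falls in
        layout[source] = (col * cell_w, row * cell_h, cell_w, cell_h)
    return layout
```

For the five-participant example above, calling `grid_layout` with the four other participants as sources yields a two-by-two arrangement, while a single-source list gives one region covering the whole frame.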
In embodiments of the present invention, participants 210 (see
It is often advantageous for a participant 210 (see
Typically, to enable the self video as a region of the video layout, video is sent from the participant's conference client 212 (see
Although a participant 210 could see the local video in this view along with other participants' 210 video in the other video layout regions, as described above, the process introduces latency through transmission delays and processing time, and the video quality can be reduced as the video signal may be operated on by encoders, decoders, mixers, processors, and affected by transmission limitations including bandwidth or signal loss. Local video viewed in this manner will be degraded, and the reality of the not-so-real-time aspects of the communication system will be more noticeable.
In one embodiment of the present invention, a videoconference system 200 (see
Turning back to
As described above, the MCU 220 includes a media mixer 230 that is configured to assemble audio and video data to be supplied to each conference client 212 from audio and video data received by the media mixer 230 from a plurality of conference clients 212. The media mixer 230 includes a video layout processor 232 configured to generate a composite video image for each of the plurality of conference clients 212. The media mixer 230 also includes an audio distribution processor, also known as an audio signal processor 234, for providing an audio signal for each of the plurality of conference clients 212. MCU 220 includes a connection manager 222 allowing connections of several participants 210 into logical rooms for shared conference communications. The connection manager 222 includes a back-channel controller 226 enabling communication between the client monitor 214 and the MCU 220. The connection manager 222 also includes an event handler 224 configured to insert interface data into an outbound video stream image through the video layout processor.
Embodiments of the present invention define a client component of the multi-participant videoconference system 200 that contains a client video mixer 211. In one embodiment, client video mixer 211 provides the video signal for display by the client component. Inputs to the client video mixer 211 include the local video from the participant's 210 capture device, video from conference channel 216, and notification/position events from client monitor 214.
In one embodiment of the present invention, client monitor 214 notifies MCU 220 across back-channel 218 if a client video mixer 211 is available as part of the client component for a given participant 210. This notification informs the video layout processor 232 within media mixer 230 of the availability of a local video loopback feature within the client component for the given participant 210. In one embodiment, the client informs the server that the client is configured with a client video mixer 211 when the client joins the multi-participant videoconference, for example. In one embodiment, the server requests the information from the client. In one embodiment, notification may be sent to enable or to disable the local video loopback feature for a specific participant 210.
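The notification described above can be pictured as a small piece of per-participant state kept on the server side. The following is a minimal sketch only; the class name `LoopbackRegistry`, its methods, and the participant identifiers are hypothetical and do not come from the specification, which does not define how the notification is stored.

```python
class LoopbackRegistry:
    """Hypothetical server-side record of which participants use local video loopback."""

    def __init__(self):
        # participant id -> True when loopback is both available and enabled
        self._active = {}

    def on_notification(self, participant_id, has_mixer, enabled):
        # Loopback is active only if the client reports a client video mixer
        # AND the feature is currently enabled for that participant.
        self._active[participant_id] = bool(has_mixer) and bool(enabled)

    def loopback_active(self, participant_id):
        # Participants that never sent a notification are treated as not capable.
        return self._active.get(participant_id, False)


registry = LoopbackRegistry()
registry.on_notification("participant-a", has_mixer=True, enabled=True)
registry.on_notification("participant-b", has_mixer=False, enabled=True)
```

In this sketch a later notification for the same participant simply overwrites the earlier one, which matches the idea that the feature may be enabled or disabled during the conference.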
In one embodiment of the invention, the client monitor 214 measures activity over a video frame of the conference client 212 and reports events to the MCU 220 across back-channel 218. The events are relayed to media mixer 230. Events are translated into commands, or server defined actions, and interpreted by the modules within the server component. The video layout processor 232 monitors commands for each client component regarding selection of a video layout and the requested video requirement for each video region within the layout.
A region of a participant's 210 video layout that is configured or identified to contain the local video of the same participant 210 is called a local video region. If a local video region is defined, and the video layout processor 232 has been notified to enable the local video loopback feature for the participant 210, the video layout processor 232 signals the participant's 210 client component through the back-channel communication link 218 with a set of position information. In one embodiment, the position information includes the coordinates (x, y) and the size (width, height) of the local video region within the view of the composite video signal sent from the MCU 220 to the client component across conference channel 216. In other embodiments, position and size information is provided to participant 210 in any manner compatible with and capable of being understood by client and server components of multi-participant video conference system 200. In one embodiment, if the identified local video region is obscured by other regions in the video layout, then the set of position and size information includes the coordinates and size of each overlapping region.
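One possible in-memory shape for the position information is sketched below. The names `Region`, `PositionInfo`, `intersects`, and `build_position_info` are illustrative assumptions; the specification only requires coordinates (x, y), a size (width, height), and the same values for each region that overlaps the local video region.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Region:
    """Rectangle in composite-video pixel coordinates (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


@dataclass(frozen=True)
class PositionInfo:
    """Local video region plus the regions that lie on top of it."""
    local: Region
    overlaps: Tuple[Region, ...] = ()


def intersects(a: Region, b: Region) -> bool:
    # Two axis-aligned rectangles overlap when they overlap on both axes.
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)


def build_position_info(local: Region, regions_on_top: Iterable[Region]) -> PositionInfo:
    # Keep only the regions that actually obscure part of the local region.
    overlaps = tuple(r for r in regions_on_top if intersects(local, r))
    return PositionInfo(local=local, overlaps=overlaps)


info = build_position_info(
    Region(0, 0, 640, 480),
    [Region(480, 360, 160, 120), Region(700, 0, 100, 100)],
)
```

Here the second candidate region lies entirely outside the local video region, so only the first is reported as an overlap.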
The client monitor 214 receives the position and size information, and then relays the position and size information to the client video mixer 211. The client video mixer 211 combines incoming video for conference client 212 from conference channel 216 with the participant's 210 local video before display on the conference client 212. The position and size information describes an area of the incoming composite conference video to replace with local video in a mixing operation. The resulting video signal is then displayed on the conference client 212.
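The mixing operation can be illustrated with frames modeled as lists of pixel rows. This is a sketch under stated assumptions, not the mixer's actual implementation: the function name `mix_local_video` is hypothetical, the local frame is assumed to be already scaled to the region size, and the region is assumed to lie entirely inside the composite frame.

```python
def mix_local_video(composite, local, x, y, width, height):
    """Return a copy of `composite` with the given rectangle replaced by `local`.

    `composite` and `local` are lists of rows of pixel values. `local` must
    have at least `height` rows of at least `width` pixels each.
    """
    out = [row[:] for row in composite]  # leave the incoming frame untouched
    for r in range(height):
        for c in range(width):
            out[y + r][x + c] = local[r][c]
    return out


composite_frame = [[0] * 4 for _ in range(4)]   # 4x4 frame from the conference channel
local_frame = [[9, 9], [9, 9]]                  # 2x2 locally captured frame
mixed = mix_local_video(composite_frame, local_frame, x=1, y=2, width=2, height=2)
```

The result is what would be handed to the display; the composite frame received from the server is not modified.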
In one embodiment of the present invention, the server exercises primary control over the local video loopback process and technique. As described in greater detail below, the server manages performance, quality, etc., of the video processing by controlling codec efficiencies, bitrate, etc. The server defines the video layout, whether or not a participant 210 requests a specific layout, and the server defines the region, the location and the size, where the participant local video is to be presented. The server defines the region for local video, provides the coordinates and size of the region to the client monitor 214 over the back-channel 218, and prepares the conference composite video having an identified region for the local video loopback when the feature is available and enabled. In response to the dynamics of a typical videoconference, as video regions change, so does the identified local video region, and the server informs the client monitor 214 of any change.
In one embodiment of the present invention, as changes to a participant's 210 video layout affect the local video region (either by being obscured by other regions, changing location or size, etc.), the video layout processor 232 signals the client monitor 214 with a new set of position information. The client video mixer 211 can make adjustments to the conference video accordingly. If a local video region is no longer configured within a participant's 210 video layout, the video layout processor 232 signals the participant's 210 client component with empty position information (i.e., coordinates and size null or zero). The client video mixer 211 can stop combining local video and pass the conference video to the display unchanged.
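The update and pass-through behavior described above might look like the following on the client side. The class `ClientVideoMixer` and its method names are hypothetical; the sketch assumes that empty position information arrives as a zero or null width or height, and that the local frame has already been scaled to the region size.

```python
class ClientVideoMixer:
    """Hypothetical client-side mixer that tracks the current local video region."""

    def __init__(self):
        self.region = None  # (x, y, width, height), or None when no region is set

    def on_position_info(self, x, y, width, height):
        # Null or zero size means the layout no longer has a local video region.
        if not width or not height:
            self.region = None
        else:
            self.region = (x, y, width, height)

    def process(self, composite, local):
        # With no region configured, pass the conference video through unchanged.
        if self.region is None:
            return composite
        x, y, w, h = self.region
        out = [row[:] for row in composite]
        for r in range(h):
            out[y + r][x:x + w] = local[r][:w]
        return out


mixer = ClientVideoMixer()
frame = [[0, 0, 0], [0, 0, 0]]
before = mixer.process(frame, [[7, 7]])          # no region yet: pass-through
mixer.on_position_info(0, 0, 2, 1)
during = mixer.process(frame, [[7, 7]])          # region set: local video mixed in
mixer.on_position_info(0, 0, 0, 0)               # empty position information
after = mixer.process(frame, [[7, 7]])           # pass-through again
```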
Through back-channel notification, the video layout processor 232 is notified that the client component is enabled to replace known regions of the conference video with the participant's 210 local video using a mixing method of the client video mixer 211. As a result, the local video region within the video layout of the composite conference video may define a place holder for the local video. In one embodiment, video layout processor 232 provides client monitor 214 the position and size information over back channel 218 at the time the local video loopback feature is activated, at a time when the video layout changes (i.e., the size of the region changes, the position of the region changes, overlapping regions change, etc.) during the period of activation, and when the local video loopback feature is disabled. In other words, the information is not provided continuously with each video frame delivered while enabled.
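The point that position information is sent on activation, on layout change, and on deactivation, rather than with every frame, can be sketched as simple change detection. The names `PositionNotifier`, `layout_updated`, `disable`, and `EMPTY`, and the tuple layout `(x, y, width, height, overlaps)`, are assumptions made for illustration; the `send` callable stands in for the back-channel.

```python
EMPTY = (0, 0, 0, 0, ())  # assumed encoding of "no local video region"


class PositionNotifier:
    """Hypothetical server-side helper that sends position info only when it changes."""

    def __init__(self, send):
        self._send = send   # callable standing in for the back-channel
        self._last = None   # last info sent; None if nothing has been sent

    def layout_updated(self, info):
        # May be called as often as the layout is evaluated (even per frame);
        # a message goes out only on the first call and when info differs.
        if info != self._last:
            self._send(info)
            self._last = info

    def disable(self):
        # On deactivation, tell the client to stop mixing local video.
        if self._last is not None and self._last != EMPTY:
            self._send(EMPTY)
        self._last = None


sent = []
notifier = PositionNotifier(sent.append)
first = (0, 0, 320, 240, ())
for _ in range(3):                       # three frames, same layout
    notifier.layout_updated(first)
notifier.layout_updated((10, 0, 320, 240, ()))   # region moved
notifier.disable()
```

In this run three messages cross the back-channel in total, although the layout was evaluated four times.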
Remaining conference video is defined in areas 260 (260 a, 260 b, 260 c). In accordance with an embodiment of the invention, local video region 258 is essentially the entire area of the video display for the selected layout. Conference video areas 260 overlie and obscure regions of the local video, region 258. Position and size information describing the conference video areas 260 will identify the local video region 258 and a set of coordinates and a size for each overlapping region 260.
In addition to local video signal 254 and conference composite video 256, position and size information 264 is provided to client video mixer 262. Position and size information, in one embodiment, includes the coordinates and the size of the local video, region 258, within the view of the conference composite video 256. Since, in system schematic 250, local video, region 258, is obscured by conference video areas 260, the coordinates and size of each area 260 are also provided. In other embodiments, the coordinates and size of the local video region are sufficient to identify the local video region, as illustrated in
Resulting video 266 includes local video 268 and conference video areas 270 (270 a, 270 b, 270 c) in an integrated video display. The resulting video 266 is displayed on conference client 212, and provides the quality and near-real-time features of local video, integrated with the composite conference video for an enhanced multi-participant videoconference environment.
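For the layout in which conference video areas overlie the local video region, one way to picture the mix is to write the local video into its region first and then restore the composite pixels of each overlapping area. This is only a sketch: the function name `mix_with_overlaps` is hypothetical, regions are assumed to be `(x, y, width, height)` tuples inside the frame, and the local frame is assumed to be pre-scaled to the local region.

```python
def mix_with_overlaps(composite, local, region, overlaps):
    """Mix `local` into `region` of `composite`, keeping `overlaps` from the composite."""
    x0, y0, w, h = region
    out = [row[:] for row in composite]
    # Step 1: local video fills its whole region.
    for r in range(h):
        for c in range(w):
            out[y0 + r][x0 + c] = local[r][c]
    # Step 2: conference areas that overlie the region are copied back on top.
    for ox, oy, ow, oh in overlaps:
        for r in range(oy, oy + oh):
            out[r][ox:ox + ow] = composite[r][ox:ox + ow]
    return out


conference = [[1] * 4 for _ in range(4)]   # composite conference video (value 1)
camera = [[2] * 4 for _ in range(4)]       # local video scaled to the full frame (value 2)
result = mix_with_overlaps(conference, camera, (0, 0, 4, 4), [(2, 2, 2, 2)])
```

The lower-right 2x2 area keeps the conference video, and the remainder of the frame shows local video.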
In one embodiment of the present invention, content for the local video region in the composite video signal for the participant 210 (see
In one embodiment, codec efficiencies are gained by implementing the present invention. By way of example, MCU 220 (see
The method continues with operation 304 in which a video layout is defined for conference video. In one embodiment, each participant selects a desired video layout, including content for each of the defined regions in the selected video layout. In one embodiment, the multi-participant videoconference system provides a participant with a selection of video layouts from which to choose. In one embodiment, the multi-participant videoconference system dictates a video layout that will be provided to each participant. In operation 304 the video layout is defined for each participant, whether or not the video layouts of all participants are identical, and however the video layout is chosen.
Next, in operation 306, the server side of the multi-participant videoconference system receives a request from one or more participants to use local video. In one embodiment of the invention, a participant desiring to use local video loopback must have a local, client video mixer. In requesting local video loopback, the participant informs the server of the capability to mix local video into the composite conference video supplied by the server.
The method continues with operation 308 in which a region is identified in the defined video layout for the local video. In one embodiment, the server defines the region in response to client selection. By way of example, a client may select a video layout with a desired content for each region in the layout including a region for local video. The server defines the region by, for example, identifying a position and size of a region for local video within the composite conference video. The region may be a discrete, singular region of the composite conference video, or the region may be a cut-away or overlapping region with composite conference video regions obscuring portions of the local video region, or with composite conference video regions being obscured by portions of the local region. In one embodiment, the server defines and identifies regions of the composite conference video including a region for local video loopback without client input. In operation 308, the region for local video loopback is defined within the composite conference video. In one embodiment, the position and size of the defined region is provided by the server to the client over the back-channel.
In operation 310, the server prepares a composite conference video to be supplied to the participant. In one embodiment, the composite conference video is prepared by a video layout processor of the media mixer in the server side of the multi-participant videoconference system. The composite video includes an identified region for local video. In other words, the region of the video layout for local video that was defined in operation 308 is prepared or constructed in the composite video as an area or region of null or empty data, in one embodiment. In one embodiment, the identified region is painted all black, all gray, or some color pattern. In one embodiment, the identified region is painted with a static, stable image such as the last image of video received prior to activating local video loopback, or painted with some other alternative stable data.
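Operation 310 can be sketched as a composition step that skips the viewer's own video when loopback is enabled, leaving a uniform fill in that region. The function `compose_for`, the layout representation, and the participant names are hypothetical; source frames are assumed to be already scaled to their regions, and a fill value of 0 stands in for "all black".

```python
def compose_for(viewer, layout, latest_frames, frame_w, frame_h,
                loopback_enabled, fill=0):
    """Build the composite frame for one viewer.

    `layout` is a list of (source_id, (x, y, width, height)) entries and
    `latest_frames` maps source_id to that source's most recent frame.
    """
    canvas = [[fill] * frame_w for _ in range(frame_h)]
    for source, (x, y, w, h) in layout:
        if source == viewer and loopback_enabled:
            continue  # leave the placeholder fill; the client mixes local video here
        src = latest_frames[source]
        for r in range(h):
            canvas[y + r][x:x + w] = src[r][:w]
    return canvas


layout = [("alice", (0, 0, 2, 2)), ("bob", (2, 0, 2, 2))]
frames = {"alice": [[5, 5], [5, 5]], "bob": [[6, 6], [6, 6]]}
with_loopback = compose_for("alice", layout, frames, 4, 2, loopback_enabled=True)
without_loopback = compose_for("alice", layout, frames, 4, 2, loopback_enabled=False)
```

A static fill in the identified region gives the encoder unchanging content to compress there, which is the motivation the description gives for painting the region rather than carrying the participant's own video back.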
The method concludes with operation 312 in which the composite video is transmitted to the participant for mixing and display. The composite video includes the region of null or empty data, or painted all black, all gray, or with some static, stable data, that will contain the locally mixed video by the participant. In one embodiment, the coordinates and size of the identified region are transmitted to the participant over a back-channel connection. In the case of a video layout in which the region identified for local video has overlapping regions of composite conference video, the coordinates and size of each overlapping conference video region are transmitted to the participant across the back-channel connection in addition to the coordinates and size of the local video region. Upon transmission of the composite conference video, the method is done. The method operations performed by the client side of the multi-participant videoconference system are described below in reference to
The method proceeds with operation 324 in which local video is captured by the participant. In one embodiment, the capturing of local video by a participant, and the forwarding of the captured video to the server is usual practice for a contributing participant of a multi-participant videoconference. In operation 324, video is captured according to usual practice, and the locally captured video is transmitted to both the local client video mixer and to the server for integration into the composite conference video.
In operation 326, the participant receives a size and location in a video layout for local video. In one embodiment, the size and location information is received by the client component over the back-channel. Size and location information defines a region of the composite conference video for local video. In one embodiment, the size and location information is received as a set of coordinates (x, y) defining the location, and a dimension (width, height) defining the size. In one embodiment, if the region defined for local video obscures, or is obscured by any other region defined in the composite conference video, the size and location of the affected region are also received. In one embodiment, the size and location information is used by the client video mixer to mix local video loopback into the composite conference video.
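The specification does not fix a wire format for the size and location information. Purely as an assumption for illustration, the sketch below treats the back-channel payload as a JSON object with a `local` rectangle and an optional `overlaps` list; the field names and the function `parse_position_message` are hypothetical.

```python
import json


def parse_position_message(payload):
    """Parse an assumed JSON back-channel message into (region, overlaps).

    Returns (None, []) when the message carries empty position information,
    that is, a missing region or a null or zero width or height.
    """
    msg = json.loads(payload)
    local = msg.get("local")
    if not local or not local.get("width") or not local.get("height"):
        return None, []
    region = (local["x"], local["y"], local["width"], local["height"])
    overlaps = [(o["x"], o["y"], o["width"], o["height"])
                for o in msg.get("overlaps", [])]
    return region, overlaps


payload = ('{"local": {"x": 16, "y": 8, "width": 320, "height": 240}, '
           '"overlaps": [{"x": 200, "y": 100, "width": 64, "height": 48}]}')
region, overlaps = parse_position_message(payload)
```

The returned tuples are in the form a client video mixer would need: one rectangle for the local video region and one per affected overlapping region.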
The method continues with operation 328 in which a composite video of the multi-participant videoconference, prepared by the server in the media mixer, is received by the participant across the conference channel. The client, being appropriately configured for local video loopback and having selected a local video loopback option, receives a composite conference video signal from the server with an identified region for local video from the local video loopback. In one embodiment, the identified region is painted with null or empty data. In one embodiment, the identified region is painted all black, or all gray, etc. In one embodiment, the identified region contains the same local video that was transmitted to the server and processed, and that will ultimately be replaced by the participant local video processed in the client video mixer. In one embodiment, a single frame of the local video, or some other video, is painted in the identified region, eliminating frame changes; this frame will be replaced by participant local video processed in the client video mixer.
Next, in operation 330, the local video is integrated into the composite conference video by the client. In one embodiment, the client receives a local video signal into the client video mixer, and receives a composite conference video into the client video mixer. The composite conference video for the particular client includes a region identified in some manner such as by null or empty data, painted all black or all gray, etc., defined for the local video using coordinate and size information delivered by the server across the back-channel. In operation 330, the client video mixer integrates the local video into the region identified for local video in the composite conference video, using the coordinate and size information, and any applicable overlapping coordinate and size information, delivered by the server.
The method concludes with operation 332 in which the video, incorporating the local video in the region so defined, is displayed by the conference client. Upon display of the integrated video, the method is done.
In summary, the above described invention provides a multi-participant videoconference system having a back-channel network, and implementing a method for local video loopback. The system provides for preparing and transmitting, by the server to the client, composite conference video with a region defined for local video, the region being defined by empty or null data. The client then mixes local video into the composite conference video using a client video mixer. The local video is integrated into the defined region, providing the client an option to define the overall composite conference video that the client will monitor, and to use local video in place of conference composite video, overcoming the inherent delays of transmission, encoding, decoding, scaling, etc. Further, server performance is increased and codec efficiency is gained, because defining a region of null or empty data in the composite conference video signal decreases the demand for resources and uses bandwidth more efficiently. A more efficient allocation of finite resources results, along with correspondingly improved quality, frame rate, etc.
With the above embodiments in mind, it should be understood that the invention may employ various computer-implemented operations involving data stored in computer systems. These operations are those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. Further, the manipulations performed are often referred to in terms such as producing, identifying, determining, or comparing.
The invention can also be embodied as computer readable code on a computer readable medium. The computer readable medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the computer readable medium include hard drives, network attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and other optical and non-optical data storage devices. The computer readable medium can also be distributed over network-coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.