Publication number: US 20040001546 A1
Publication type: Application
Application number: US 10/444,511
Publication date: Jan 1, 2004
Filing date: May 23, 2003
Priority date: Jun 3, 2002
Also published as: CA2430460A1, CA2430460C, CN1471320A, CN100455018C, EP1369820A2, EP1369820A3, US8374245, US8873630, US9185427, US20070014358, US20130148737, US20150016527, US20160134890
Inventors: Alexandros Tourapis, Shipeng Li, Feng Wu
Original Assignee: Alexandros Tourapis, Shipeng Li, Feng Wu
Spatiotemporal prediction for bidirectionally predictive (B) pictures and motion vector prediction for multi-picture reference motion compensation
Abstract
Several improvements for use with Bidirectionally Predictive (B) pictures within a video sequence are provided. In certain improvements, Direct Mode encoding and/or Motion Vector Prediction are enhanced using spatial prediction techniques. In other improvements, Motion Vector Prediction includes temporal distance and subblock information for more accurate prediction. Such improvements and others presented herein significantly improve the performance of any applicable video coding system/logic.
Images (12)
Claims (117)
What is claimed is:
1. A method for use in encoding video data within a sequence of video frames, the method comprising:
identifying at least a portion of at least one video frame to be a Bidirectionally Predictive (B) picture; and
selectively encoding said B picture using at least spatial prediction to encode at least one motion parameter associated with said B picture.
2. The method as recited in claim 1, wherein said B picture includes a macroblock.
3. The method as recited in claim 2, wherein selectively encoding said B picture using at least spatial prediction to encode said at least one motion parameter produces a Direct Macroblock.
4. The method as recited in claim 1, wherein said B picture includes a slice.
5. The method as recited in claim 1, wherein said B picture includes at least a portion of a macroblock.
6. The method as recited in claim 1, wherein selectively encoding said B picture using at least spatial prediction to encode said at least one motion parameter further includes employing linear motion vector prediction for said B picture based on at least one reference picture that is at least another portion of said video frame.
7. The method as recited in claim 1, wherein selectively encoding said B picture using at least spatial prediction to encode said at least one motion parameter further includes employing non-linear motion vector prediction for said B picture based on at least one reference picture that is at least another portion of said video frame.
8. The method as recited in claim 1, wherein selectively encoding said B picture using at least spatial prediction to encode said at least one motion parameter further includes employing median motion vector prediction for said B picture based on at least two reference pictures that are both portions of said video frame.
9. The method as recited in claim 1, wherein said at least one motion parameter includes at least one motion vector.
10. The method as recited in claim 1, wherein at least one other portion of at least one other video frame is processed to further selectively encode said B picture using temporal prediction to encode at least one temporal-based motion parameter associated with said B picture.
11. The method as recited in claim 10, wherein said temporal prediction includes bidirectional temporal prediction.
12. The method as recited in claim 10, wherein said at least one other video frame is a Predictive (P) frame.
13. The method as recited in claim 10, further comprising selectively scaling said at least one temporal-based motion parameter based at least in part on a temporal distance between said other video frame and said frame that includes said B picture.
14. The method as recited in claim 13, wherein temporal distance information is encoded within a header associated with said encoded B picture.
15. The method as recited in claim 10, wherein said at least one other portion includes at least a portion of a macroblock within said at least one other video frame.
16. A computer-readable medium having computer implementable instructions for configuring at least one processing unit to perform acts comprising:
accessing data for a sequence of video frames;
identifying at least a portion of at least one video frame to be a Bidirectionally Predictive (B) picture; and
selectively encoding said B picture using at least spatial prediction to encode at least one motion parameter associated with said B picture.
17. The computer-readable medium as recited in claim 16, wherein said B picture includes a macroblock.
18. The computer-readable medium as recited in claim 17, wherein selectively encoding said B picture using at least spatial prediction to encode said at least one motion parameter produces a Direct Macroblock.
19. The computer-readable medium as recited in claim 16, wherein said B picture includes a slice.
20. The computer-readable medium as recited in claim 16, wherein said B picture includes at least a portion of a macroblock.
21. The computer-readable medium as recited in claim 16, wherein selectively encoding said B picture using at least spatial prediction to encode said at least one motion parameter further includes employing linear motion vector prediction for said B picture based on at least one reference picture that is at least another portion of said video frame.
22. The computer-readable medium as recited in claim 16, wherein selectively encoding said B picture using at least spatial prediction to encode said at least one motion parameter further includes employing non-linear motion vector prediction for said B picture based on at least one reference picture that is at least another portion of said video frame.
23. The computer-readable medium as recited in claim 16, wherein selectively encoding said B picture using at least spatial prediction to encode said at least one motion parameter further includes employing median motion vector prediction for said B picture based on at least two reference pictures that are both portions of said video frame.
24. The computer-readable medium as recited in claim 16, wherein said at least one motion parameter includes at least one motion vector.
25. The computer-readable medium as recited in claim 16, wherein at least one other portion of at least one other video frame is processed to further selectively encode said B picture using temporal prediction to encode at least one temporal-based motion parameter associated with said B picture.
26. The computer-readable medium as recited in claim 25, wherein said temporal prediction includes bidirectional temporal prediction.
27. The computer-readable medium as recited in claim 25, wherein said at least one other video frame is a Predictive (P) frame.
28. The computer-readable medium as recited in claim 25, having computer implementable instructions for configuring said at least one processing unit to perform acts comprising:
selectively scaling said at least one temporal-based motion parameter based at least in part on a temporal distance between said other video frame and said frame that includes said B picture.
29. The computer-readable medium as recited in claim 28, wherein temporal distance information is encoded within a header associated with said encoded B picture.
30. The computer-readable medium as recited in claim 25, wherein said at least one other portion includes at least a portion of a macroblock within said at least one other video frame.
31. An apparatus for use in encoding video data within a sequence of video frames, the apparatus comprising:
logic operatively configured to access video data for a sequence of video frames, identify at least a portion of at least one video frame to be a Bidirectionally Predictive (B) picture, and selectively encode said B picture using at least spatial prediction to encode at least one motion parameter associated with said B picture.
32. The apparatus as recited in claim 31, wherein said B picture includes a macroblock.
33. The apparatus as recited in claim 32, wherein said logic selectively encodes said B picture using at least spatial prediction to encode said at least one motion parameter to produce a Direct Macroblock.
34. The apparatus as recited in claim 31, wherein said B picture includes a slice.
35. The apparatus as recited in claim 31, wherein said B picture includes at least a portion of a macroblock.
36. The apparatus as recited in claim 31, wherein said logic is further configured to employ linear motion vector prediction for said B picture based on at least one reference picture that is at least another portion of said video frame.
37. The apparatus as recited in claim 31, wherein said logic is further configured to employ non-linear motion vector prediction for said B picture based on at least one reference picture that is at least another portion of said video frame.
38. The apparatus as recited in claim 31, wherein said logic is further configured to employ median motion vector prediction for said B picture based on at least two reference pictures that are both portions of said video frame.
39. The apparatus as recited in claim 31, wherein said at least one motion parameter includes at least one motion vector.
40. The apparatus as recited in claim 31, wherein said logic is further configured to process at least one other portion of at least one other video frame and selectively encode said B picture using temporal prediction to encode at least one temporal-based motion parameter associated with said B picture.
41. The apparatus as recited in claim 40, wherein said temporal prediction includes bidirectional temporal prediction.
42. The apparatus as recited in claim 40, wherein said at least one other video frame is a Predictive (P) frame.
43. The apparatus as recited in claim 40, wherein said logic is further configured to selectively scale said at least one temporal-based motion parameter based at least in part on a temporal distance between said other video frame and said frame that includes said B picture.
44. The apparatus as recited in claim 43, wherein said logic is further configured to include temporal distance information within a header associated with said encoded B picture.
45. The apparatus as recited in claim 40, wherein said at least one other portion includes at least a portion of a macroblock within said at least one other video frame.
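The temporal scaling recited in claims 13, 28, and 43 can be sketched as follows. This is a hypothetical illustration, not part of the claims: the names (`trd`, `trb`) and the use of integer division are assumptions, following the common direct-mode practice of scaling a co-located motion vector by the ratio of temporal distances.

```python
def scale_direct_mode_mv(mv, trd, trb):
    """Scale a co-located reference motion vector for a B picture.

    mv:  (x, y) motion vector of the co-located portion (assumed name).
    trd: temporal distance between the two reference frames.
    trb: temporal distance from the B picture to the earlier reference.
    Floor division stands in for whatever rounding a real codec uses.
    """
    mvx, mvy = mv
    # Forward vector: scale toward the earlier reference.
    forward = (mvx * trb // trd, mvy * trb // trd)
    # Backward vector: scale toward the later reference (negative span).
    backward = (mvx * (trb - trd) // trd, mvy * (trb - trd) // trd)
    return forward, backward
```

With a B picture midway between two references (trd = 2, trb = 1), the co-located vector splits into equal and opposite halves.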
46. A method for encoding video data, the method comprising:
identifying at least a portion of at least one video frame to be coded in an enhanced direct mode; and
encoding said portion in said enhanced direct mode using at least spatial information associated with said portion within said at least one video frame.
47. The method as recited in claim 46 wherein encoding said portion in said enhanced direct mode further includes using temporal information associated with said portion and at least one other portion of at least one other video frame.
48. The method as recited in claim 46, wherein encoding said portion in said enhanced direct mode further includes using motion vector prediction based on at least one other portion within said at least one video frame.
49. The method as recited in claim 48, wherein said motion vector prediction includes median prediction.
50. The method as recited in claim 46, wherein said enhanced direct mode includes using spatial prediction to calculate said spatial information based on at least one linear function that considers motion information of at least one other portion of said at least one video frame.
51. The method as recited in claim 46, wherein said enhanced direct mode includes using spatial prediction to calculate said spatial information based on at least one non-linear function that considers motion information of at least one other portion of said at least one video frame.
52. A computer-readable medium having computer implementable instructions for configuring at least one processing unit to perform acts comprising encoding video data by identifying at least a portion of at least one video frame to be coded in an enhanced direct mode, and encoding said portion in said enhanced direct mode using at least spatial information associated with said portion within said at least one video frame.
53. The computer-readable medium as recited in claim 52 wherein encoding said portion in said enhanced direct mode further includes using temporal information associated with said portion and at least one other portion of at least one other video frame.
54. The computer-readable medium as recited in claim 52, wherein encoding said portion in said enhanced direct mode further includes using motion vector prediction based on at least one other portion within said at least one video frame.
55. The computer-readable medium as recited in claim 54, wherein said motion vector prediction includes median prediction.
56. The computer-readable medium as recited in claim 52, wherein said enhanced direct mode includes using spatial prediction to calculate said spatial information based on at least one linear function that considers motion information of at least one other portion of said at least one video frame.
57. The computer-readable medium as recited in claim 52, wherein said enhanced direct mode includes using spatial prediction to calculate said spatial information based on at least one non-linear function that considers motion information of at least one other portion of said at least one video frame.
58. An apparatus comprising:
logic operatively configured to encode video data by identifying at least a portion of at least one video frame to be coded in an enhanced direct mode, and encode said portion in said enhanced direct mode using at least spatial information associated with said portion within said at least one video frame.
59. The apparatus as recited in claim 58 wherein said logic is further operatively configured to use temporal information associated with said portion and at least one other portion of at least one other video frame to encode said portion in said enhanced direct mode.
60. The apparatus as recited in claim 58, wherein said logic is further operatively configured to encode said portion in said enhanced direct mode using motion vector prediction based on at least one other portion within said at least one video frame.
61. The apparatus as recited in claim 60, wherein said motion vector prediction includes median prediction.
62. The apparatus as recited in claim 58, wherein said logic is further operatively configured to use spatial prediction to calculate said spatial information based on at least one linear function that considers motion information of at least one other portion of said at least one video frame.
63. The apparatus as recited in claim 58, wherein said logic is further operatively configured to use spatial prediction to calculate said spatial information based on at least one non-linear function that considers motion information of at least one other portion of said at least one video frame.
64. A method to predict a reference picture in direct mode video encoding, the method comprising:
selecting a reference picture from a group comprising a minimum reference picture for a plurality of predictions related to at least a portion of a video frame to be encoded, a median reference picture for said plurality of predictions, and a current reference picture based on a single direction prediction; and
encoding said at least one portion of said video frame based on said selected reference picture.
65. The method as recited in claim 64, wherein selecting said reference picture further includes selecting at least one spatially related prediction.
66. The method as recited in claim 64, wherein selecting said reference picture further includes selecting at least one temporally related prediction.
67. A computer-readable medium having computer implementable instructions for configuring at least one processing unit to perform acts comprising:
selecting a reference picture from a group comprising a minimum reference picture for a plurality of predictions related to at least a portion of a video frame to be encoded, a median reference picture for said plurality of predictions, and a current reference picture based on a single direction prediction; and
encoding said at least one portion of said video frame based on said selected reference picture.
68. The computer-readable medium as recited in claim 67, wherein selecting said reference picture further includes selecting at least one spatially related prediction.
69. The computer-readable medium as recited in claim 67, wherein selecting said reference picture further includes selecting at least one temporally related prediction.
70. An apparatus comprising:
logic that is operatively configured to select a reference picture from a group comprising a minimum reference picture for a plurality of predictions related to at least a portion of a video frame to be encoded, a median reference picture for said plurality of predictions, and a current reference picture based on a single direction prediction, and encode said at least one portion of said video frame based on said selected reference picture.
71. The apparatus as recited in claim 70, wherein said logic is operatively configured to select at least one spatially related prediction.
72. The apparatus as recited in claim 70, wherein said logic is operatively configured to select at least one temporally related prediction.
73. A method for use in selecting between temporal prediction, spatial prediction, or both temporal and spatial prediction for encoding at least a portion of at least one video frame in an enhanced direct mode, the method comprising:
selecting temporal prediction if at least one motion vector of a collocated portion of said video frame is zero;
selecting only spatial prediction if surrounding portions within said video frame use different reference pictures than a collocated reference picture;
selecting spatial prediction if a motion flow associated with said portion of said video frame is substantially different than a motion flow associated with a reference picture;
if temporal prediction of direct mode is signaled inside an image header, then selecting temporal prediction; and
if spatial prediction of direct mode is signaled inside said image header, then selecting spatial prediction.
74. The method as recited in claim 73, further comprising:
correcting at least one temporally predicted parameter based on spatial information.
75. The method as recited in claim 73, further comprising:
correcting at least one spatially predicted parameter based on temporal information.
76. A computer-readable medium having computer implementable instructions for configuring at least one processing unit to perform acts comprising:
selecting between temporal prediction, spatial prediction, or both temporal and spatial prediction for encoding at least a portion of at least one video frame in an enhanced direct mode, such that:
temporal prediction is selected if at least one motion vector of a collocated portion of said video frame is zero,
only spatial prediction is selected if surrounding portions within said video frame use different reference pictures than a collocated reference picture,
spatial prediction is selected if a motion flow associated with said portion of said video frame is substantially different than a motion flow associated with a reference picture,
temporal prediction is selected if temporal prediction of direct mode is signaled inside an image header, and
spatial prediction is selected if spatial prediction of direct mode is signaled inside said image header.
77. The computer-readable medium as recited in claim 76, further comprising:
correcting at least one temporally predicted parameter based on spatial information.
78. The computer-readable medium as recited in claim 76, further comprising:
correcting at least one spatially predicted parameter based on temporal information.
79. An apparatus comprising:
logic operatively configured to select between and employ temporal prediction, spatial prediction, or both temporal and spatial prediction for encoding at least a portion of at least one video frame in an enhanced direct mode, wherein said logic:
selects temporal prediction if at least one motion vector of a collocated portion of said video frame is zero,
selects only spatial prediction if surrounding portions within said video frame use different reference pictures than a collocated reference picture,
selects spatial prediction if a motion flow associated with said portion of said video frame is substantially different than a motion flow associated with a reference picture,
selects temporal prediction if temporal prediction of direct mode is signaled inside an image header, and
selects spatial prediction if spatial prediction of direct mode is signaled inside said image header.
80. The apparatus as recited in claim 79, wherein said logic is further operatively configured to correct at least one temporally predicted parameter based on spatial information.
81. The apparatus as recited in claim 79, wherein said logic is further operatively configured to correct at least one spatially predicted parameter based on temporal information.
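The selection rules of claims 73, 76, and 79 can be sketched as a single decision function. This is an illustration only; the rule ordering is an assumption (explicit header signalling is honoured first, then the spatial-only conditions, then the zero-motion temporal case), and all parameter names are hypothetical.

```python
def select_direct_mode(header_mode=None,
                       neighbors_use_different_refs=False,
                       motion_flow_differs=False,
                       colocated_mv_zero=False):
    """Choose between temporal and spatial direct-mode prediction.

    header_mode: "temporal" or "spatial" when the image header signals
    a mode explicitly, else None. All other flags mirror the conditions
    enumerated in claim 73.
    """
    if header_mode in ("temporal", "spatial"):
        return header_mode                    # header signalling wins
    if neighbors_use_different_refs or motion_flow_differs:
        return "spatial"                      # spatial-only conditions
    if colocated_mv_zero:
        return "temporal"                     # zero-motion collocated MV
    return "spatial"                          # assumed default
```

Claims 74-75 and their counterparts then allow correcting a temporally predicted parameter with spatial information, or vice versa, after this choice is made.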
82. A method for use in encoding video data, the method comprising:
selecting a reference portion of a future video frame to serve as a B picture to at least one portion of an earlier video frame;
using motion vectors associated with said reference frame to calculate motion vectors associated with said at least one portion; and
encoding said at least one portion based on said calculated motion vectors associated with said at least one portion.
83. A method as recited in claim 82, wherein using said motion vectors associated with said reference frame to calculate said motion vectors associated with said at least one portion further includes estimating at least one possible prediction for use in direct mode coding by projecting and inverting backward and forward motion vectors of the reference portion.
84. The method as recited in claim 83, wherein encoding said at least one portion based on said calculated motion vectors associated with said at least one portion further includes applying selective projection and inversion based on at least one temporal parameter associated with said reference portion with respect to said at least one portion.
85. The method as recited in claim 82, wherein only one reference portion is used for B pictures when encoding in direct mode.
86. The method as recited in claim 82, wherein encoding said at least one portion based on said calculated motion vectors associated with said at least one portion further includes encoding in a direct mode wherein at least one of said calculated motion vectors is based on at least one projected motion vector that refers to at least two reference portions in two different reference pictures.
87. The method as recited in claim 82, wherein encoding said at least one portion based on said calculated motion vectors associated with said at least one portion further includes encoding in a direct mode wherein at least one of said calculated motion vectors is based on spatial prediction associated with said reference portion.
88. A computer-readable medium having computer implementable instructions for configuring at least one processing unit to perform acts comprising:
selecting a reference portion of a future video frame to serve as a B picture to at least one portion of an earlier video frame;
using motion vectors associated with said reference frame to calculate motion vectors associated with said at least one portion; and
encoding said at least one portion based on said calculated motion vectors associated with said at least one portion.
89. A computer-readable medium as recited in claim 88, wherein using said motion vectors associated with said reference frame to calculate said motion vectors associated with said at least one portion further includes estimating at least one possible prediction for use in direct mode coding by projecting and inverting backward and forward motion vectors of the reference portion.
90. The computer-readable medium as recited in claim 89, wherein encoding said at least one portion based on said calculated motion vectors associated with said at least one portion further includes applying selective projection and inversion based on at least one temporal parameter associated with said reference portion with respect to said at least one portion.
91. The computer-readable medium as recited in claim 88, wherein only one reference portion is used for B pictures when encoding in direct mode.
92. The computer-readable medium as recited in claim 88, wherein encoding said at least one portion based on said calculated motion vectors associated with said at least one portion further includes encoding in a direct mode wherein at least one of said calculated motion vectors is based on at least one projected motion vector that refers to at least two reference portions in two different reference pictures.
93. The computer-readable medium as recited in claim 88, wherein encoding said at least one portion based on said calculated motion vectors associated with said at least one portion further includes encoding in a direct mode wherein at least one of said calculated motion vectors is based on spatial prediction associated with said reference portion.
94. An apparatus comprising:
logic operatively configured to select a reference portion of a future video frame to serve as a B picture to at least one portion of an earlier video frame, use motion vectors associated with said reference frame to calculate motion vectors associated with said at least one portion, and encode said at least one portion based on said calculated motion vectors associated with said at least one portion.
95. The apparatus as recited in claim 94, wherein said logic is further operatively configured to estimate at least one possible prediction for use in direct mode coding by projecting and inverting backward and forward motion vectors of the reference portion.
96. The apparatus as recited in claim 95, wherein said logic is further operatively configured to apply selective projection and inversion based on at least one temporal parameter associated with said reference portion with respect to said at least one portion.
97. The apparatus as recited in claim 94, wherein only one reference portion is used for B pictures when encoding in direct mode.
98. The apparatus as recited in claim 94, wherein said logic is further operatively configured to encode in a direct mode wherein at least one of said calculated motion vectors is based on at least one projected motion vector that refers to at least two reference portions in two different reference pictures.
99. The apparatus as recited in claim 94, wherein said logic is further operatively configured to encode in a direct mode wherein at least one of said calculated motion vectors is based on spatial prediction associated with said reference portion.
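The projection and inversion of claims 83, 89, and 95 can be sketched as follows. This is a hypothetical illustration: the reference portion's forward motion vector is projected onto the B portion by temporal-distance scaling, and the backward vector is obtained by inverting it against the full vector; names and integer rounding are assumptions.

```python
def project_and_invert(mv_ref, trd, trb):
    """Derive a B portion's forward/backward MVs from a reference MV.

    mv_ref: (x, y) motion vector of the reference portion (assumed name).
    trd:    temporal distance spanned by mv_ref.
    trb:    temporal distance from the B picture to the earlier reference.
    """
    # Project: scale the reference MV down to the B picture's span.
    forward = tuple(v * trb // trd for v in mv_ref)
    # Invert: the backward MV is the projected MV minus the full MV.
    backward = tuple(f - v for f, v in zip(forward, mv_ref))
    return forward, backward
```

The selective application of claim 84 would condition this projection/inversion on the temporal parameters relating the reference portion to the B portion.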
100. A method for use in determining motion vectors during video encoding, the method comprising:
selecting at least three predictors A, B and C that each uses a different reference picture having an associated temporal distance $TR_A$, $TR_B$, and $TR_C$, respectively, and a motion vector $\vec{MV}_A$, $\vec{MV}_B$, and $\vec{MV}_C$; and
predicting a median motion vector $\vec{MV}_{pred}$ associated with a current reference picture that has a temporal distance equal to $TR$.
101. The method as recited in claim 100, wherein said median predictor $\vec{MV}_{pred}$ is calculated as:
$$\vec{MV}_{pred} = TR \cdot \mathrm{Median}\!\left(\frac{\vec{MV}_A}{TR_A}, \frac{\vec{MV}_B}{TR_B}, \frac{\vec{MV}_C}{TR_C}\right).$$
102. The method as recited in claim 100, wherein said median predictor $\vec{MV}_{pred}$ is calculated as:
$$\vec{MV}_{pred} = \mathrm{Median}\!\left(\mathrm{Ave}(\vec{MV}_{C_1}, \vec{MV}_{C_2}), \mathrm{Ave}(\vec{MV}_{A_1}, \vec{MV}_{A_2}), \vec{MV}_B\right).$$
103. The method as recited in claim 100, further comprising:
selecting at least a fourth predictor D having an associated temporal distance $TR_D$ and a motion vector $\vec{MV}_D$, and wherein said median predictor $\vec{MV}_{pred}$ is calculated as:
$$\vec{MV}_{pred} = \mathrm{Median}\!\left(\mathrm{Median}(\vec{MV}_{C_1}, \vec{MV}_{C_2}, \vec{MV}_D), \ldots, \mathrm{Median}(\vec{MV}_D, \vec{MV}_{A_1}, \vec{MV}_{C_2}), \mathrm{Median}(\vec{MV}_B, \vec{MV}_{A_1}, \vec{MV}_{A_2})\right)$$
104. The method as recited in claim 100, further comprising:
selecting at least a fourth predictor D having an associated temporal distance $TR_D$ and a motion vector $\vec{MV}_D$, and wherein said median predictor $\vec{MV}_{pred}$ is calculated as:
$$\vec{MV}_{pred} = \mathrm{Median}\!\left(\vec{MV}_{C_1}, \vec{MV}_{C_2}, \vec{MV}_D, \vec{MV}_B, \vec{MV}_{A_1}, \vec{MV}_{A_2}\right).$$
105. The method as recited in claim 100, further comprising selectively substituting an adjacent portion of a reference frame for a selected portion of said reference frame for use in determining motion vector prediction when intra coding is used.
106. A computer-readable medium having computer implementable instructions for configuring at least one processing unit to perform acts comprising:
selecting at least three predictors A, B and C that each uses a different reference picture having an associated temporal distance $TR_A$, $TR_B$, and $TR_C$, respectively, and a motion vector $\vec{MV}_A$, $\vec{MV}_B$, and $\vec{MV}_C$; and
predicting a median motion vector $\vec{MV}_{pred}$ associated with a current reference picture that has a temporal distance equal to $TR$.
107. The computer-readable medium as recited in claim 106, wherein said median predictor $\vec{MV}_{pred}$ is calculated as:
$$\vec{MV}_{pred} = TR \cdot \mathrm{Median}\!\left(\frac{\vec{MV}_A}{TR_A}, \frac{\vec{MV}_B}{TR_B}, \frac{\vec{MV}_C}{TR_C}\right).$$
108. The computer-readable medium as recited in claim 106, wherein said median predictor $\vec{MV}_{pred}$ is calculated as:
$$\vec{MV}_{pred} = \mathrm{Median}\!\left(\mathrm{Ave}(\vec{MV}_{C_1}, \vec{MV}_{C_2}), \mathrm{Ave}(\vec{MV}_{A_1}, \vec{MV}_{A_2}), \vec{MV}_B\right).$$
109. The computer-readable medium as recited in claim 106, further comprising:
selecting at least a fourth predictor D having an associated temporal distance $TR_D$ and a motion vector $\vec{MV}_D$, and wherein said median predictor $\vec{MV}_{pred}$ is calculated as:
$$\vec{MV}_{pred} = \mathrm{Median}\!\left(\mathrm{Median}(\vec{MV}_{C_1}, \vec{MV}_{C_2}, \vec{MV}_D), \ldots, \mathrm{Median}(\vec{MV}_D, \vec{MV}_{A_1}, \vec{MV}_{C_2}), \mathrm{Median}(\vec{MV}_B, \vec{MV}_{A_1}, \vec{MV}_{A_2})\right)$$
110. The computer-readable medium as recited in claim 106, further comprising:
selecting at least a fourth predictor D having an associated temporal distance $TR_D$ and a motion vector $\vec{MV}_D$, and wherein said median predictor $\vec{MV}_{pred}$ is calculated as:
$$\vec{MV}_{pred} = \mathrm{Median}\!\left(\vec{MV}_{C_1}, \vec{MV}_{C_2}, \vec{MV}_D, \vec{MV}_B, \vec{MV}_{A_1}, \vec{MV}_{A_2}\right).$$
111. The computer-readable medium as recited in claim 106, further comprising selectively substituting an adjacent portion of a reference frame for a selected portion of said reference frame for use in determining motion vector prediction when intra coding is used.
112. An apparatus comprising logic operatively configured to select at least three predictors A, B and C that each uses a different reference picture having an associated temporal distance TRA, TRB, and TRC respectively, and a motion vector MVA, MVB, and MVC, and predict a median motion vector MVpred associated with a current reference picture that has a temporal distance equal to TR.
113. The apparatus as recited in claim 112, wherein said median predictor MVpred is calculated as:
MV_pred = TR × Median(MV_A/TR_A, MV_B/TR_B, MV_C/TR_C).
114. The apparatus as recited in claim 112, wherein said median predictor MVpred is calculated as:
MV_pred = Median(Ave(MV_C1, MV_C2), Ave(MV_A1, MV_A2), MV_B).
115. The apparatus as recited in claim 112, wherein said logic is further operatively configured to select at least a fourth predictor D having an associated temporal distance TRD and a motion vector MVD, and wherein said median predictor MVpred is calculated as:
MV_pred = Median(Median(MV_C1, MV_C2, MV_D), Median(MV_D, MV_A1, MV_C2), Median(MV_B, MV_A1, MV_A2))
116. The apparatus as recited in claim 112, wherein said logic is further operatively configured to select at least a fourth predictor D having an associated temporal distance TRD and a motion vector MVD, and wherein said median predictor MVpred is calculated as:
MV_pred = Median(MV_C1, MV_C2, MV_D, MV_B, MV_A1, MV_A2).
117. The apparatus as recited in claim 112, wherein said logic is further operatively configured to selectively substitute an adjacent portion of a reference frame for a selected portion of said reference frame for use in determining motion vector prediction when intra coding is used.
Description
    RELATED PATENT APPLICATIONS
  • [0001]
    This U.S. Non-provisional Application for Letters Patent claims the benefit of priority from, and hereby incorporates by reference the entire disclosure of, co-pending U.S. Provisional Application for Letters Patent Serial No. 60/385,965, filed Jun. 3, 2002, and titled “Spatiotemporal Prediction for Bidirectionally Predictive (B) Frames and Motion Vector Prediction for Multi-Frame Reference Motion Compensation”.
  • TECHNICAL FIELD
  • [0002]
    This invention relates to video coding, and more particularly to methods and apparatuses for providing improved coding and/or prediction techniques associated with different types of video data.
  • BACKGROUND
  • [0003]
The motivation for increased coding efficiency in video coding has led to the adoption in the Joint Video Team (JVT) (a standards body) of more refined and complicated models and modes describing motion information for a given macroblock. These models and modes tend to take better advantage of the temporal redundancies that may exist within a video sequence. See, for example, ITU-T, Video Coding Experts Group (VCEG), “JVT Coding—(ITU-T H.26L & ISO/IEC JTC1 Standard)—Working Draft Number 2 (WD-2)”, ITU-T JVT-B 118, March 2002; and/or Heiko Schwarz and Thomas Wiegand, “Tree-structured macroblock partition”, Doc. VCEG-N17, December 2001.
  • [0004]
    There is continuing need for further improved methods and apparatuses that can support the latest models and modes and also possibly introduce new models and modes to take advantage of improved coding techniques.
  • SUMMARY
  • [0005]
The above stated needs and others are addressed, for example, by a method for use in encoding video data within a sequence of video frames. The method includes identifying at least a portion of at least one video frame to be a Bidirectionally Predictive (B) picture, and selectively encoding the B picture using at least spatial prediction to encode at least one motion parameter associated with the B picture. In certain exemplary implementations the B picture may include a block, a macroblock, a subblock, a slice, or other like portion of the video frame. For example, when a macroblock portion is used, the method produces a Direct Macroblock.
  • [0006]
    In certain further exemplary implementations, the method further includes employing linear or non-linear motion vector prediction for the B picture based on at least one reference picture that is at least another portion of the video frame. By way of example, in certain implementations, the method employs median motion vector prediction to produce at least one motion vector.
  • [0007]
    In still other exemplary implementations, in addition to spatial prediction, the method may also process at least one other portion of at least one other video frame to further selectively encode the B picture using temporal prediction to encode at least one temporal-based motion parameter associated with the B picture. In some instances the temporal prediction includes bidirectional temporal prediction, for example based on at least a portion of a Predictive (P) frame.
  • [0008]
    In certain other implementations, the method also selectively determines applicable scaling for a temporal-based motion parameter based at least in part on a temporal distance between the predictor video frame and the frame that includes the B picture. In certain implementations temporal distance information is encoded, for example, within a header or other like data arrangement associated with the encoded B picture.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • [0009]
    The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings. The same numbers are used throughout the figures to reference like components and/or features.
  • [0010]
    [0010]FIG. 1 is a block diagram depicting an exemplary computing environment that is suitable for use with certain implementations of the present invention.
  • [0011]
    [0011]FIG. 2 is a block diagram depicting an exemplary representative device that is suitable for use with certain implementations of the present invention.
  • [0012]
[0012]FIG. 3 is an illustrative diagram depicting spatial prediction associated with portions of a picture, in accordance with certain exemplary implementations of the present invention.
  • [0013]
    [0013]FIG. 4 is an illustrative diagram depicting Direct Prediction in B picture coding, in accordance with certain exemplary implementations of the present invention.
  • [0014]
[0014]FIG. 5 is an illustrative diagram depicting what happens when a scene change occurs or when the collocated block is intra-coded, in accordance with certain exemplary implementations of the present invention.
  • [0015]
    [0015]FIG. 6 is an illustrative diagram depicting handling of collocated intra within existing codecs wherein motion is assumed to be zero, in accordance with certain exemplary implementations of the present invention.
  • [0016]
    [0016]FIG. 7 is an illustrative diagram depicting how Direct Mode is handled when the reference picture of the collocated block in the subsequent P picture is other than zero, in accordance with certain exemplary implementations of the present invention.
  • [0017]
    [0017]FIG. 8 is an illustrative diagram depicting an exemplary scheme wherein MVFW and MVBW are derived from spatial prediction, in accordance with certain exemplary implementations of the present invention.
  • [0018]
    [0018]FIG. 9 is an illustrative diagram depicting how spatial prediction solves the problem of scene changes and the like, in accordance with certain exemplary implementations of the present invention.
  • [0019]
    [0019]FIG. 10 is an illustrative diagram depicting joint spatio-temporal prediction for Direct Mode in B picture coding, in accordance with certain exemplary implementations of the present invention.
  • [0020]
    [0020]FIG. 11 is an illustrative diagram depicting Motion Vector Prediction of a current block considering reference picture information of predictor macroblocks, in accordance with certain exemplary implementations of the present invention.
  • [0021]
    [0021]FIG. 12 is an illustrative diagram depicting how to use more candidates for Direct Mode prediction especially if bidirectional prediction is used within the B picture, in accordance with certain exemplary implementations of the present invention.
  • [0022]
    [0022]FIG. 13 is an illustrative diagram depicting how B pictures may be restricted in using future and past reference pictures, in accordance with certain exemplary implementations of the present invention.
  • [0023]
    [0023]FIG. 14 is an illustrative diagram depicting projection of collocated Motion Vectors to a current reference for temporal direct prediction, in accordance with certain exemplary implementations of the present invention.
  • [0024]
    [0024]FIGS. 15a-c are illustrative diagrams depicting Motion Vector Predictors for one MV in different configurations, in accordance with certain exemplary implementations of the present invention.
  • [0025]
[0025]FIGS. 16a-c are illustrative diagrams depicting Motion Vector Predictors for one MV with 8×8 partitions in different configurations, in accordance with certain exemplary implementations of the present invention.
  • [0026]
[0026]FIGS. 17a-c are illustrative diagrams depicting Motion Vector Predictors for one MV with additional predictors for 8×8 partitioning, in accordance with certain exemplary implementations of the present invention.
  • DETAILED DESCRIPTION
  • [0027]
Several improvements for use with Bidirectionally Predictive (B) pictures within a video sequence are described below and illustrated in the accompanying drawings. In certain improvements Direct Mode encoding and/or Motion Vector Prediction are enhanced using spatial prediction techniques. In other improvements Motion Vector prediction includes temporal distance and subblock information, for example, for more accurate prediction. Such improvements and others presented herein significantly improve the performance of any applicable video coding system/logic.
  • [0028]
    While these and other exemplary methods and apparatuses are described, it should be kept in mind that the techniques of the present invention are not limited to the examples described and shown in the accompanying drawings, but are also clearly adaptable to other similar existing and future video coding schemes, etc.
  • [0029]
    Before introducing such exemplary methods and apparatuses, an introduction is provided in the following section for suitable exemplary operating environments, for example, in the form of a computing device and other types of devices/appliances.
  • [0030]
    Exemplary Operational Environments:
  • [0031]
    Turning to the drawings, wherein like reference numerals refer to like elements, the invention is illustrated as being implemented in a suitable computing environment. Although not required, the invention will be described in the general context of computer-executable instructions, such as program modules, being executed by a personal computer.
  • [0032]
    Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multi-processor systems, microprocessor based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, portable communication devices, and the like.
  • [0033]
    The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
  • [0034]
    [0034]FIG. 1 illustrates an example of a suitable computing environment 120 on which the subsequently described systems, apparatuses and methods may be implemented. Exemplary computing environment 120 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the improved methods and systems described herein. Neither should computing environment 120 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in computing environment 120.
  • [0035]
    The improved methods and systems herein are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable include, but are not limited to, personal computers, server computers, thin clients, thick clients, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
  • [0036]
    As shown in FIG. 1, computing environment 120 includes a general-purpose computing device in the form of a computer 130. The components of computer 130 may include one or more processors or processing units 132, a system memory 134, and a bus 136 that couples various system components including system memory 134 to processor 132.
  • [0037]
Bus 136 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus, also known as Mezzanine bus.
  • [0038]
    Computer 130 typically includes a variety of computer readable media. Such media may be any available media that is accessible by computer 130, and it includes both volatile and non-volatile media, removable and non-removable media.
  • [0039]
    In FIG. 1, system memory 134 includes computer readable media in the form of volatile memory, such as random access memory (RAM) 140, and/or non-volatile memory, such as read only memory (ROM) 138. A basic input/output system (BIOS) 142, containing the basic routines that help to transfer information between elements within computer 130, such as during start-up, is stored in ROM 138. RAM 140 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processor 132.
  • [0040]
    Computer 130 may further include other removable/non-removable, volatile/non-volatile computer storage media. For example, FIG. 1 illustrates a hard disk drive 144 for reading from and writing to a non-removable, non-volatile magnetic media (not shown and typically called a “hard drive”), a magnetic disk drive 146 for reading from and writing to a removable, non-volatile magnetic disk 148 (e.g., a “floppy disk”), and an optical disk drive 150 for reading from or writing to a removable, non-volatile optical disk 152 such as a CD-ROM/R/RW, DVD-ROM/R/RW/+R/RAM or other optical media. Hard disk drive 144, magnetic disk drive 146 and optical disk drive 150 are each connected to bus 136 by one or more interfaces 154.
  • [0041]
    The drives and associated computer-readable media provide nonvolatile storage of computer readable instructions, data structures, program modules, and other data for computer 130. Although the exemplary environment described herein employs a hard disk, a removable magnetic disk 148 and a removable optical disk 152, it should be appreciated by those skilled in the art that other types of computer readable media which can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, random access memories (RAMs), read only memories (ROM), and the like, may also be used in the exemplary operating environment.
  • [0042]
    A number of program modules may be stored on the hard disk, magnetic disk 148, optical disk 152, ROM 138, or RAM 140, including, e.g., an operating system 158, one or more application programs 160, other program modules 162, and program data 164.
  • [0043]
    The improved methods and systems described herein may be implemented within operating system 158, one or more application programs 160, other program modules 162, and/or program data 164.
  • [0044]
    A user may provide commands and information into computer 130 through input devices such as keyboard 166 and pointing device 168 (such as a “mouse”). Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, serial port, scanner, camera, etc. These and other input devices are connected to the processing unit 132 through a user input interface 170 that is coupled to bus 136, but may be connected by other interface and bus structures, such as a parallel port, game port, or a universal serial bus (USB).
  • [0045]
    A monitor 172 or other type of display device is also connected to bus 136 via an interface, such as a video adapter 174. In addition to monitor 172, personal computers typically include other peripheral output devices (not shown), such as speakers and printers, which may be connected through output peripheral interface 175.
  • [0046]
    Computer 130 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 182. Remote computer 182 may include many or all of the elements and features described herein relative to computer 130.
  • [0047]
    Logical connections shown in FIG. 1 are a local area network (LAN) 177 and a general wide area network (WAN) 179. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.
  • [0048]
    When used in a LAN networking environment, computer 130 is connected to LAN 177 via network interface or adapter 186. When used in a WAN networking environment, the computer typically includes a modem 178 or other means for establishing communications over WAN 179. Modem 178, which may be internal or external, may be connected to system bus 136 via the user input interface 170 or other appropriate mechanism.
  • [0049]
Depicted in FIG. 1 is a specific implementation of a WAN via the Internet. Here, computer 130 employs modem 178 to establish communications with at least one remote computer 182 via the Internet 180.
  • [0050]
    In a networked environment, program modules depicted relative to computer 130, or portions thereof, may be stored in a remote memory storage device. Thus, e.g., as depicted in FIG. 1, remote application programs 189 may reside on a memory device of remote computer 182. It will be appreciated that the network connections shown and described are exemplary and other means of establishing a communications link between the computers may be used.
  • [0051]
Attention is now drawn to FIG. 2, which is a block diagram depicting another exemplary device 200 that is also capable of benefiting from the methods and apparatuses disclosed herein. Device 200 is representative of any one or more devices or appliances that are operatively configured to process video and/or any related types of data in accordance with all or part of the methods and apparatuses described herein and their equivalents. Thus, device 200 may take the form of a computing device as in FIG. 1, or some other form, such as, for example, a wireless device, a portable communication device, a personal digital assistant, a video player, a television, a DVD player, a CD player, a karaoke machine, a kiosk, a digital video projector, a flat panel video display mechanism, a set-top box, a video game machine, etc. In this example, device 200 includes logic 202 configured to process video data, a video data source 204 configured to provide video data to logic 202, and at least one display module 206 capable of displaying at least a portion of the video data for a user to view. Logic 202 is representative of hardware, firmware, software and/or any combination thereof. In certain implementations, for example, logic 202 includes a compressor/decompressor (codec), or the like. Video data source 204 is representative of any mechanism that can provide, communicate, output, and/or at least momentarily store video data suitable for processing by logic 202. Video data source 204 is illustratively shown as being within and/or without device 200. Display module 206 is representative of any mechanism that a user might view directly or indirectly and see the visual results of video data presented thereon. Additionally, in certain implementations, device 200 may also include some form or capability for reproducing or otherwise handling audio data associated with the video data. Thus, an audio reproduction module 208 is shown.
  • [0052]
With the examples of FIGS. 1 and 2 in mind, and others like them, the next sections focus on certain exemplary methods and apparatuses that may be at least partially practiced using such environments and devices.
  • [0053]
    Encoding Bidirectionally Predictive (B) Pictures And Motion Vector Prediction
  • [0054]
    This section describes several exemplary improvements that can be implemented to encode Bidirectionally Predictive (B) pictures and Motion Vector prediction within a video coding system or the like. The exemplary methods and apparatuses can be applied to predict motion vectors and enhancements in the design of a B picture Direct Mode. Such methods and apparatuses are particularly suitable for multiple picture reference codecs, such as, for example, JVT, and can achieve considerable coding gains especially for panning sequences or scene changes.
  • [0055]
    Bidirectionally Predictive (B) pictures are an important part of most video coding standards and systems since they tend to increase the coding efficiency of such systems, for example, when compared to only using Predictive (P) pictures. This improvement in coding efficiency is mainly achieved by the consideration of bidirectional motion compensation, which can effectively improve motion compensated prediction and thus allow the encoding of significantly reduced residue information. Furthermore, the introduction of the Direct Prediction mode for a Macroblock/block within such pictures can further increase efficiency considerably (e.g., more than 10-20%) since no motion information is encoded. Such may be accomplished, for example, by allowing the prediction of both forward and backward motion information to be derived directly from the motion vectors used in the corresponding macroblock of a subsequent reference picture.
  • [0056]
By way of example, FIG. 4 illustrates Direct Prediction in B picture coding at time t+1, based on P frames at times t and t+2, and the applicable motion vectors (MVs). Here, an assumption is made that an object in the picture is moving with constant speed. This makes it possible to predict a current position inside a B picture without having to transmit any motion vectors. The motion vectors (MV_fw, MV_bw) of the Direct Mode versus the motion vector MV of the collocated MB in the first subsequent P reference picture are basically calculated by: MV_fw = (TR_B × MV)/TR_D and MV_bw = ((TR_B − TR_D) × MV)/TR_D,
  • [0057]
where TR_B is the temporal distance between the current B picture and the reference picture pointed to by the forward MV of the collocated MB, and TR_D is the temporal distance between the future reference picture and the reference picture pointed to by the forward MV of the collocated MB.
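Under this constant-speed assumption, the scaling above can be sketched in a few lines of Python. This is an illustration only, not part of the specification; the function name and the use of integer rounding are assumptions:

```python
def direct_mode_mvs(mv, tr_b, tr_d):
    """Derive Direct Mode forward/backward MVs from the collocated
    P-block motion vector `mv` (an (x, y) pair).

    tr_b -- temporal distance between the current B picture and the
            reference pointed to by the forward MV of the collocated MB
    tr_d -- temporal distance between the future reference picture and
            that same reference
    """
    mv_fw = tuple(round(tr_b * c / tr_d) for c in mv)           # MV_fw = (TR_B x MV)/TR_D
    mv_bw = tuple(round((tr_b - tr_d) * c / tr_d) for c in mv)  # MV_bw = ((TR_B-TR_D) x MV)/TR_D
    return mv_fw, mv_bw
```

For example, with mv = (8, -4), tr_b = 1 and tr_d = 2 (the B picture halfway between the two P references), the forward MV becomes (4, -2) and the backward MV (-4, 2).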
  • [0058]
Unfortunately there are several cases where the existing Direct Mode does not provide an adequate solution, thus not efficiently exploiting the properties of this mode. In particular, existing designs of this mode usually force the motion parameters of the Direct Macroblock, in the case of the collocated Macroblock in the subsequent P picture being Intra coded, to be zero. For example, see FIG. 6, which illustrates handling of collocated intra within existing codecs wherein motion is assumed to be zero. This essentially means that, for this case, the B picture Macroblock will be coded as the average of the two collocated Macroblocks in the first subsequent and past P references. This immediately raises the following concern: if a Macroblock is Intra-coded, then how does one know how much relationship it has with the collocated Macroblock of its reference picture? In some situations, there may be little if any actual relationship. Hence, it is possible that the coding efficiency of the Direct Mode may be reduced. An extreme case can be seen in the case of a scene change as illustrated in FIG. 5. FIG. 5 illustrates what happens when a scene change occurs in the video sequence and/or when the collocated block is intra. Here, in this example, obviously no relationship exists between the two reference pictures given the scene change. In such a case, bidirectional prediction would provide little if any benefit. As such, the Direct Mode could be completely wasted. Unfortunately, conventional implementations of the Direct Mode restrict it to always perform a bidirectional prediction of a Macroblock.
  • [0059]
    [0059]FIG. 7 is an illustrative diagram depicting how Direct Mode is handled when the reference picture of the collocated block in the subsequent P picture is other than zero, in accordance with certain implementations of the present invention.
  • [0060]
An additional issue with the Direct Mode Macroblocks exists when multi-picture reference motion compensation is used. Until recently, for example, the JVT standard provided the timing distance information (TR_B and TR_D), thus allowing for the proper scaling of the parameters. Recently, this was changed in the new revision of the codec (see, e.g., Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, “Joint Committee Draft (CD) of Joint Video Specification (ITU-T Rec. H.264|ISO/IEC 14496-10 AVC)”, ITU-T JVT-C167, May 2002, which is incorporated herein by reference). In the new revision, the motion vector parameters of the subsequent P picture are to be scaled equally for the Direct Mode prediction, without taking into account the reference picture information. This could lead to significant performance degradation of the Direct Mode, since the constant motion assumption is no longer followed.
  • [0061]
Nevertheless, even if the temporal distance parameters were available, it is not always certain that the usage of the Direct Mode as defined previously is the most appropriate solution. In particular, for B pictures which are closer to a first forward reference picture, the correlation might be much stronger with that picture than with the subsequent reference picture. An extreme example which could contain such cases could be a sequence where scene A changes to scene B, and then moves back to scene A (e.g., as may happen in a news bulletin, etc.). All of the above could degrade the performance of B picture encoding considerably, since Direct Mode will not be effectively exploited within the encoding process.
  • [0062]
With these and other concerns in mind, unlike the previous definitions of the Direct Mode where only temporal prediction was used, in accordance with certain aspects of the present invention, a new Direct Macroblock type is introduced wherein temporal prediction, spatial prediction, or both are considered. The type(s) of prediction used can depend on the type of reference picture information of the first subsequent P reference picture, for example.
  • [0063]
In accordance with certain other aspects of the present invention, one may also further considerably improve motion vector prediction for both P and B pictures when multiple picture references are used, by taking into consideration temporal distances, if such are available.
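One way temporal distances can be folded into a median motion vector predictor is sketched below: each neighboring predictor's MV is normalized by its own temporal distance TR_i before the median is taken, and the result is rescaled by the current temporal distance TR. This is a hedged sketch of the idea, with illustrative function names, not a definitive implementation:

```python
def median3(a, b, c):
    # Median of three scalars: sum minus min minus max.
    return a + b + c - min(a, b, c) - max(a, b, c)

def scaled_median_mv(preds, tr):
    """preds: three (mv_x, mv_y, tr_i) triples for predictors A, B, C,
    each possibly using a different reference picture at temporal
    distance tr_i.  Returns the component-wise median MV rescaled to
    the current reference picture at temporal distance tr."""
    xs = [mx / t for (mx, _my, t) in preds]
    ys = [my / t for (_mx, my, t) in preds]
    return (round(tr * median3(*xs)), round(tr * median3(*ys)))
```

The normalization step is what preserves the constant-motion assumption when the three predictors point to different reference pictures; without it, a predictor that references a distant picture would dominate the median simply because its vector is longer.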
  • [0064]
    These enhancements are implemented in certain exemplary methods and apparatuses as described below. The methods and apparatuses can achieve significant bitrate reductions while achieving similar or better quality.
  • [0065]
    Direct Mode Enhancements:
  • [0066]
In most conventional video coding systems, Direct Mode is designed as a bidirectional prediction scheme where motion parameters are always predicted temporally from the motion parameters in the subsequent P images. In this section, an enhanced Direct Mode technique is provided in which spatial information may also/alternatively be considered for such predictions.
  • [0067]
    One or more of the following exemplary techniques may be implemented as needed, for example, depending on the complexity and/or specifications of the system.
  • [0068]
    One technique is to implement spatial prediction of the motion vector parameters of the Direct Mode without considering temporal prediction. Spatial prediction can be accomplished, for example, using existing Motion Vector prediction techniques used for motion vector encoding (such as, e.g., median prediction). If multiple picture references are used, then the reference picture of the adjacent blocks may also be considered (even though there is no such restriction and the same reference, e.g. 0, could always be used).
  • [0069]
Motion parameters and reference pictures could be predicted as follows, with reference to FIG. 3, which illustrates spatial prediction associated with portions A-E (e.g., macroblocks, slices, etc.) assumed to be available and part of a picture. Here, E is predicted in general from A, B, C as Median(A, B, C). If C is actually outside of the picture then D is used instead. If B, C, and D are outside of the picture, then only A is used, whereas if A does not exist, it is replaced with (0,0). Those skilled in the art will recognize that spatial prediction may be done at a subblock level as well.
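The fallback rules above can be captured in a short sketch. This is an illustration only; None marks an unavailable neighbor, and the treatment of a single missing B or C as (0, 0) is an assumption, since the text leaves that case open:

```python
def median3(a, b, c):
    # Median of three scalars: sum minus min minus max.
    return a + b + c - min(a, b, c) - max(a, b, c)

def predict_mv(a, b, c, d):
    """Predict the MV of block E from neighbors A, B, C; D replaces C
    when C is outside the picture.  Each argument is an (x, y) tuple,
    or None when that block is unavailable."""
    if c is None:
        c = d                                  # C outside the picture: use D instead
    if b is None and c is None:                # B, C, and D all unavailable
        return a if a is not None else (0, 0)  # only A is used; missing A -> (0, 0)
    a = a if a is not None else (0, 0)         # missing A is replaced with (0, 0)
    # Assumption: a single remaining missing predictor is treated as (0, 0).
    b = b if b is not None else (0, 0)
    c = c if c is not None else (0, 0)
    return (median3(a[0], b[0], c[0]), median3(a[1], b[1], c[1]))
```

For instance, predict_mv((2, 3), (4, 1), (6, 5), None) yields (4, 3), the component-wise median, while predict_mv((2, 3), None, None, None) falls back to A alone.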
  • [0070]
    In general spatial prediction can be seen as a linear or nonlinear function of all available motion information calculated within a picture or a group of macroblocks/blocks within the same picture.
  • [0071]
    There are various methods available that may be arranged to predict the reference picture for Direct Mode. For example, one method may be to select a minimum reference picture among the predictions. In another method, a median reference picture may be selected. In certain methods, a selection may be made between a minimum reference picture and median reference picture, e.g., if the minimum is zero. In still other implementations, a higher priority could also be given to either vertical or horizontal predictors (A and B) due to their possibly stronger correlation with E.
  • [0072]
If one of the predictions does not exist (e.g., all surrounding macroblocks are predicted with only the same direction, FW or BW, or are intra), then only the existing one is used (single-direction prediction), or the missing one could be predicted from the one available. For example, if forward prediction is available then: MV_bw = ((TR_B − TR_D) × MV_fw)/TR_B
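Under the same linear-motion assumption, this single-direction fallback can be sketched as follows (an illustrative helper, not part of the specification):

```python
def backward_from_forward(mv_fw, tr_b, tr_d):
    """Derive a missing backward MV from an available forward MV by
    linear scaling: MV_bw = ((TR_B - TR_D) * MV_fw) / TR_B."""
    return tuple(round((tr_b - tr_d) * c / tr_b) for c in mv_fw)
```

With mv_fw = (8, -4), tr_b = 1 and tr_d = 2, the derived backward MV is (-8, 4), pointing in the opposite temporal direction as expected.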
  • [0073]
    Temporal prediction is used for macroblocks if the subsequent P reference is non-intra, as in existing codecs. Attention is now drawn to FIG. 8, in which MV_fw and MV_bw are derived from spatial prediction (the median MV of surrounding macroblocks). If either one is not available (i.e., no predictors), then one-direction prediction is used. If the subsequent P reference is intra, then spatial prediction can be used instead, as described above. Assuming that no restrictions exist, if one of the predictions is not available, then Direct Mode becomes a single-direction prediction mode.
  • [0074]
    This could considerably benefit video coding when the scene changes, for example, as illustrated in FIG. 9, and/or even when fading exists within a video sequence. As illustrated in FIG. 9, spatial prediction may be used to solve the problem of a scene change.
  • [0075]
    If temporal distance information is not available within a codec, temporal prediction will not be as efficient in the direct mode for blocks where the collocated P reference block has a non-zero reference picture. In such a case, spatial prediction may also be used as above. As an alternative, one may estimate scaling parameters if one of the surrounding macroblocks also uses the same reference picture as the collocated P reference block. Furthermore, special handling may be provided for the case of zero motion (or close-to-zero motion) with a non-zero reference. Here, regardless of temporal distance, the forward and backward motion vectors could always be taken as zero. The best solution, however, may be to always examine the reference picture information of surrounding macroblocks and, based thereon, decide how the direct mode should be handled in such a case.
  • [0076]
    More particularly, for example, given a non-zero reference, the following sub cases may be considered:
  • [0077]
    Case A: Temporal prediction is used if the motion vectors of the collocated P block are zero.
  • [0078]
    Case B: If all surrounding macroblocks use different reference pictures than the collocated P reference, then spatial prediction appears to be a better choice and temporal prediction is not used.
  • [0079]
    Case C: If motion flow inside the B picture appears to be quite different than the one in the P reference picture, then spatial prediction is used instead.
  • [0080]
    Case D: Spatial or temporal prediction of Direct Mode macroblocks could be signaled inside the image header. A pre-analysis of the image could be performed to decide which should be used.
  • [0081]
    Case E: Correction of the temporally predicted parameters based on spatial information (or vice versa). Thus, for example, if both appear to have the same or approximately the same phase information, then the spatial information could be a very good candidate for the direct mode prediction. A correction could also be done on the phase, thus correcting the sub-pixel accuracy of the prediction.
  • [0082]
    FIG. 10 illustrates a joint spatio-temporal prediction for Direct Mode in B picture coding. Here, in this example, Direct Mode can be a 1- to 4-direction mode depending on the information available. Instead of using bidirectional prediction for Direct Mode macroblocks, a multi-hypothesis extension of such a mode can be made and multiple predictions used instead.
  • [0083]
    Combined with the discussion above, Direct Mode macroblocks can be predicted using from one up to four possible motion vectors, depending on the information available. Such can be decided, for example, based on the mode of the collocated P reference image macroblock and on the surrounding macroblocks in the current B picture. In such a case, if the spatial prediction is too different from the temporal one, one of them could be selected as the only prediction in favor of the other. Since spatial prediction, as described previously, might favor a different reference picture than the temporal one, the same macroblock might be predicted from more than two reference pictures.
  • [0084]
    The JVT standard does not restrict the first future reference to be a P picture. Hence, in such a standard, the first future reference can be a B picture, as illustrated in FIG. 12, or even a Multi-Hypothesis (MH) picture. This implies that more motion vectors are assigned per macroblock, and one may use this property to increase the efficiency of the Direct Mode by more effectively exploiting the additional motion information.
  • [0085]
    In FIG. 12, the first subsequent reference picture is a B picture (pictures B8 and B9). This enables one to use more candidates for Direct Mode prediction especially if bidirectional prediction is used within the B picture.
  • [0086]
    In particular one may perform the following:
  • [0087]
    a.) If the collocated reference block in the first future reference uses bidirectional prediction, the corresponding motion vectors (forward or backward) are used for calculating the motion vectors of the current block. Since the backward motion vector of the reference corresponds to a future reference picture, special care should be taken in the estimate of the current motion parameters. Thus, as illustrated in FIG. 12, the backward motion vector of B8, MV_B8,bw, can be calculated as 2·MV_B7,bw due to the temporal distances between B8, B7, and P6. Similarly, for B9 the backward motion vector can be taken as MV_B7,bw, even though these refer to B7. One may also restrict these to refer to the first subsequent P picture, in which case these motion vectors can be scaled accordingly. A similar conclusion can be deduced for the forward motion vectors. Multiple picture references or intra macroblocks can be handled similarly to the previous discussion.
  • [0088]
    b.) If bidirectional prediction for the collocated block is used, then, in this example, one may estimate four possible predictions for one macroblock for the direct mode case by projecting and inverting the backward and forward motion vectors of the reference.
  • [0089]
    c.) Selective projection and inversion may be used depending on temporal distance. According to this solution, one selects the motion vectors from the reference picture that are more reliable for the prediction. For example, considering the illustration in FIG. 12, one will note that B8 is much closer to P2 than to P6. This implies that the backward motion vector of B7 may not be a very reliable prediction. In this case, direct mode motion vectors can therefore be calculated only from the forward prediction of B7. For B9, however, both motion vectors seem adequate for the prediction and therefore may be used. Such decisions may also be signaled within the header of the image. Other conditions and rules may also be implemented. For example, additional spatial confidence of a prediction and/or a motion vector phase may be considered. Note, in particular, that if the forward and backward motion vectors have no relationship, then the backward motion vector might be too unreliable to use.
  • [0090]
    Single Picture Reference for B Pictures:
  • [0091]
    A special case exists with the usage of only one picture reference for B pictures (although, typically, a forward and a backward reference are necessary), regardless of how many reference pictures are used in P pictures. From observations of encoding sequences in the current JVT codec, for example, it was noted that, if one compares the single-picture reference versus the multi-picture reference case using B pictures, even though the encoding performance of P pictures for the multi-picture case is almost always superior to that of the single-picture case, the same is not always true for B pictures.
  • [0092]
    One reason for this observation is the overhead of the reference picture used for each macroblock. Considering that B pictures rely more on motion information than P pictures, the reference picture information overhead reduces the number of bits that are transmitted for the residue information at a given bitrate, which thereby reduces efficiency. A rather easy and efficient solution could be the selection of only one picture reference for either backward or forward motion compensation, thus not needing to transmit any reference picture information.
  • [0093]
    This is considered with reference to FIGS. 13 and 14. As illustrated in FIG. 13, B pictures can be restricted to using only one future and one past reference picture. Thus, for direct mode motion vector calculation, projection of the motion vectors is necessary. A projection of the collocated MVs to the current reference for temporal direct prediction is illustrated in FIG. 14 (note that it is possible that TD_D,0 > TD_D,1). Thus, in this example, Direct Mode motion parameters are calculated by projecting motion vectors that refer to other reference pictures onto the two reference pictures, or by using spatial prediction as in FIG. 13. Note that such options not only allow for possibly reduced encoding complexity of B pictures, but also tend to reduce memory requirements, since fewer B pictures (e.g., a maximum of two) need to be stored if B pictures are allowed to reference B pictures. In certain cases a reference picture of the first future reference picture may no longer be available in the reference buffer. This could immediately generate a problem for the estimate of Direct Mode macroblocks, and special handling of such cases is required. Obviously there is no such problem if a single picture reference is used. However, if multiple picture references are desired, then possible solutions include projecting the motion vector(s) to either the first forward reference picture, and/or to the reference picture that was closest to the unavailable picture. Either solution could be viable, and again spatial prediction could be an alternative solution.
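The projection of a collocated motion vector onto the two references can be sketched with the usual temporal-direct scaling (an illustrative approximation in plain float arithmetic, not the exact fixed-point formula of any standard; names are assumptions):

```python
def project_collocated_mv(mv_col, td_d, td_b):
    """Project a collocated MV spanning temporal distance td_d onto a
    B picture lying td_b from its forward reference.

    Returns (forward, backward) motion vectors; the backward vector
    points toward the future reference, hence the (td_b - td_d) factor.
    """
    fw = tuple(v * td_b / td_d for v in mv_col)
    bw = tuple(v * (td_b - td_d) / td_d for v in mv_col)
    return fw, bw
```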
  • [0094]
    Refinements of the Motion Vector Prediction for Single- and Multi-Picture Reference Motion Compensation:
  • [0095]
    Motion vector prediction for multi-picture reference motion compensation can significantly affect the performance of both B and P picture coding. Existing standards, such as, for example, JVT, do not always consider the reference pictures of the macroblocks used in the prediction. The only consideration such standards do make is when only one of the prediction macroblocks uses the same reference. In such a case, only that predictor is used for the motion prediction. There is no consideration of the reference picture if only one or all predictors are using a different reference.
  • [0096]
    In such a case, for example, and in accordance with certain further aspects of the present invention, one can scale the predictors according to their temporal distance versus the current reference. Attention is drawn to FIG. 11, which illustrates Motion Vector prediction of a current block (C) considering the reference picture information of predictor macroblocks (Pr) and performance of proper adjustments (e.g., scaling of the predictors).
  • [0097]
    If predictors A, B, and C use reference pictures with temporal distances TR_A, TR_B, and TR_C respectively, and the current reference picture has a temporal distance equal to TR, then the median predictor is calculated as follows:

    MV_pred = TR · Median(MV_A / TR_A, MV_B / TR_B, MV_C / TR_C)
  • [0098]
    If integer computation is to be used, it may be easier to place the multiplication inside the median, thus increasing accuracy. The division could also be replaced with shifting, but that reduces performance, and it might be necessary to handle signed shifting as well (−1>>N=−1). It is thus very important in such cases to have the temporal distance information available for performing the appropriate scaling. Such could also be available within the header, if not otherwise predictable.
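The scaled median above, with the multiplication moved inside the median as the text suggests for integer computation, might be sketched as follows (function name and input layout are illustrative; Python's floor division stands in for the signed-shifting concern noted above):

```python
def scaled_median_predictor(preds, tr):
    """Compute MV_pred = TR * Median(MV_A/TR_A, MV_B/TR_B, MV_C/TR_C)
    in integer arithmetic, multiplying by TR before dividing so that
    precision is not lost to early truncation.

    preds: list of three (mv, tr_i) pairs, mv an (x, y) integer tuple.
    """
    scaled = [tuple(v * tr // tr_i for v in mv) for mv, tr_i in preds]
    # Component-wise median of the three scaled predictors.
    return tuple(sorted(comp)[1] for comp in zip(*scaled))
```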
  • [0099]
    Motion Vector prediction as discussed previously is basically median biased, meaning that the median value among a set of predictors is selected for the prediction. If one only uses one type of macroblock (e.g., 16×16) with one Motion Vector (MV), then these predictors can be defined, for example, as illustrated in FIG. 15. Here, MV predictors are shown for one MV. In FIG. 15a, the MB is not in the first row or the last column. In FIG. 15b, the MB is in the last column. In FIG. 15c, the MB is in the first row.
  • [0100]
    The JVT standard improves on this further by also considering the case that only one of the three predictors exists (i.e. Macroblocks are intra or are using a different reference picture in the case of multi-picture prediction). In such a case, only the existing or same reference predictor is used for the prediction and all others are not examined.
  • [0101]
    Intra coding does not always imply that a new object has appeared or that the scene changes. It might instead, for example, be the case that motion estimation and compensation are inadequate to represent the current object (e.g., due to the search range, the motion estimation algorithm used, quantization of the residue, etc.) and that better results could be achieved through Intra coding instead. The available motion predictors could still be adequate to provide a good motion vector predictor solution.
  • [0102]
    What is intriguing is the consideration of subblocks within a Macroblock, with each one being assigned different motion information. The MPEG-4 and H.263 standards, for example, can have up to four such subblocks (e.g., with size 8×8), whereas the JVT standard allows up to sixteen subblocks while also being able to handle variable block sizes (e.g., 4×4, 4×8, 8×4, 8×8, 8×16, 16×8, and 16×16). In addition, JVT also allows for 8×8 Intra subblocks, thus complicating things even further.
  • [0103]
    Considering the common cases of JVT and MPEG-4/H.263 (8×8 and 16×16), the predictor set for a 16×16 macroblock is illustrated in FIGS. 16a-c, having a similar arrangement to FIGS. 15a-c, respectively. Here, Motion Vector predictors are shown for one MV with 8×8 partitions. Even though the described predictors could give reasonable results in some cases, it appears that they may not adequately cover all possible predictions.
  • [0104]
    Attention is drawn next to FIGS. 17a-c, which are also in a similar arrangement to FIGS. 15a-c, respectively. Here, in FIGS. 17a-c there are two additional predictors that could also be considered in the prediction phase (C1 and A2). If 4×4 blocks are also considered, this increases the possible predictors by four.
  • [0105]
    Instead of employing a median of the three predictors A, B, and C (or A1, B, and C2), one may now have some additional, and apparently more reliable, options. Thus, for example, one can observe that predictors A1 and C2 are essentially too close to one another, and it may be the case that they are not too representative in the prediction phase. Instead, selecting predictors A1, C1, and B seems to be a more reliable solution due to their separation. An alternative could also be the selection of A2 instead of A1, but that may again be too close to predictor B. Simulations suggest that the first case is usually a better choice. For the last column, A2 could be used instead of A1. For the first row, either one of A1 and A2, or even their average value, could be used. Gains of up to 1% were noted within JVT with this implementation.
  • [0106]
    The previous case adds some tests for the last column. By examining FIG. 17b, for example, it is obvious that such tends to provide the best partitioning available. Thus, an optional solution could be the selection of A2, C1, and B (from the upper-left position). This may not always be recommended however, since such an implementation may adversely affect the performance of right predictors.
  • [0107]
    An alternative solution would be the usage of averages of predictors within a Macroblock. The median may then be performed as follows:
  • MV_pred = Median(Ave(MV_C1, MV_C2), Ave(MV_A1, MV_A2), MV_B)
  • [0108]
    For median row/column calculation, the median can be calculated as:
  • MV_pred = Median(Median(MV_C1, MV_C2, MV_D), Median(MV_D, MV_A1, MV_C2), Median(MV_B, MV_A1, MV_A2))
  • [0109]
    Another possible solution is a Median5 solution. This is probably the most complicated solution due to computation (quick-sort or bubble-sort could, for example, be used), but could potentially yield the best results. If 4×4 blocks are considered, for example, then Median9 could also be used:
  • MV_pred = Median(MV_C1, MV_C2, MV_D, MV_B, MV_A1, MV_A2)
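A component-wise MedianN over an arbitrary predictor set (Median5, Median9, and so on) can be sketched as below; sorted() stands in for the quick-sort or bubble-sort mentioned above, and taking the lower middle element for even counts is an illustrative choice, not one mandated by the text:

```python
def median_n(mvs):
    """Component-wise median of any number of MV predictors,
    each an (x, y) tuple."""
    def median(vals):
        s = sorted(vals)
        return s[(len(s) - 1) // 2]  # lower middle for even counts
    return tuple(median(comp) for comp in zip(*mvs))
```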
  • [0110]
    Considering that JVT allows the existence of Intra subblocks within an Inter Macroblock (e.g., a tree macroblock structure), such could also be taken into consideration within the Motion Prediction. If a subblock (e.g., from the Macroblocks above or to the left only) to be used for the MV prediction is Intra, then the adjacent subblock may be used instead. Thus, if A1 is intra but A2 is not, then A1 can be replaced by A2 in the prediction. A further possibility is to replace one missing Intra Macroblock with the MV predictor from the upper-left position. In FIG. 17a, for example, if C1 is missing then D may be used instead.
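The substitution rules in this paragraph might be sketched as a small helper (a hypothetical function; the name-to-MV mapping, with None marking an intra or missing subblock, is an illustrative representation, not the patent's data model):

```python
def substitute_predictors(neighbors):
    """Apply the fallbacks from the text: if A1 is intra, use the
    adjacent A2 instead; if C1 is missing, use the upper-left D.
    neighbors maps names ('A1', 'A2', 'C1', 'D') to MV tuples or None.
    Returns the MVs to use in place of A1 and C1."""
    a1 = neighbors.get('A1')
    if a1 is None:                 # A1 intra: fall back to A2
        a1 = neighbors.get('A2')
    c1 = neighbors.get('C1')
    if c1 is None:                 # C1 missing: fall back to D
        c1 = neighbors.get('D')
    return a1, c1
```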
  • [0111]
    In the above sections, several improvements on B picture Direct Mode and on Motion Vector Prediction were presented. It was illustrated that spatial prediction can also be used for Direct Mode macroblocks, whereas Motion Vector prediction should consider temporal distance and subblock information for more accurate prediction. Such considerations should significantly improve the performance of any applicable video coding system.
  • [0112]
    Conclusion
  • [0113]
    Although the description above uses language that is specific to structural features and/or methodological acts, it is to be understood that the invention defined in the appended claims is not limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the invention.
US20060280253 *Aug 21, 2006Dec 14, 2006Microsoft CorporationTimestamp-Independent Motion Vector Prediction for Predictive (P) and Bidirectionally Predictive (B) Pictures
US20060285594 *Jun 21, 2005Dec 21, 2006Changick KimMotion estimation and inter-mode prediction
US20070014358 *Sep 20, 2006Jan 18, 2007Microsoft CorporationSpatiotemporal prediction for bidirectionally predictive(B) pictures and motion vector prediction for multi-picture reference motion compensation
US20070064809 *Sep 14, 2006Mar 22, 2007Tsuyoshi WatanabeCoding method for coding moving images
US20070140338 *Dec 19, 2005Jun 21, 2007Vasudev BhaskaranMacroblock homogeneity analysis and inter mode prediction
US20070140352 *Dec 19, 2005Jun 21, 2007Vasudev BhaskaranTemporal and spatial analysis of a video macroblock
US20070171977 *Jan 24, 2007Jul 26, 2007Shintaro KudoMoving picture coding method and moving picture coding device
US20070217510 *Jul 27, 2006Sep 20, 2007Fujitsu LimitedVideo coding method, video coding apparatus and video coding program
US20080031332 *Oct 9, 2007Feb 7, 2008Jeon Byeong MMethod of deriving a motion vector of a bi-predictive block based on a list o motion vector of a co-located block in a reference picture
US20080031341 *Oct 9, 2007Feb 7, 2008Jeon Byeong MMethod of deriving a motion vector of a bi-predictive block based on a list 1 motion vector of a co-located block in a reference picture
US20080031342 *Oct 9, 2007Feb 7, 2008Jeon Byeong MMethod of deriving a motion vector of a bi-predictive block based on scaling a motion vector of a co-located block in a reference picture
US20080031343 *Oct 9, 2007Feb 7, 2008Jeon Byeong MMethod of deriving a motion vector of a bi-predictive image block based on a list o motion vector of a co-located block using a bit operation
US20080037639 *Oct 9, 2007Feb 14, 2008Jeon Byeong MMethod of deriving a motion vector of a bi-predictive block based on temporal distances associated with a co-located block in a reference picture
US20080037640 *Oct 9, 2007Feb 14, 2008Jeon Byeong MMethod of deriving a motion vector of a bi-predictive image block by applying a bit operation
US20080037644 *Oct 9, 2007Feb 14, 2008Jeon Byeong MMethod of deriving a motion vector of a bi-predictive image block based on a list 1 motion vector of a co-located block using a bit operation
US20080056366 *Sep 1, 2006Mar 6, 2008Vasudev BhaskaranIn-Loop Noise Reduction Within an Encoder Framework
US20080063071 *Oct 31, 2007Mar 13, 2008Yoshinori SuzukiMoving picture encoding method and decoding method
US20080063072 *Oct 31, 2007Mar 13, 2008Yoshinori SuzukiMoving picture encoding method and decoding method
US20080069225 *Oct 31, 2007Mar 20, 2008Yoshinori SuzukiMoving picture encoding method and decoding method
US20080069235 *Oct 31, 2007Mar 20, 2008Kiyofumi AbeMoving picture coding method and moving picture decoding method
US20080075171 *Oct 31, 2007Mar 27, 2008Yoshinori SuzukiMoving picture encoding method and decoding method
US20090003446 *Jun 30, 2007Jan 1, 2009Microsoft CorporationComputing collocated macroblock information for direct mode macroblocks
US20090067497 *Oct 8, 2008Mar 12, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090067498 *Oct 8, 2008Mar 12, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090067499 *Oct 9, 2008Mar 12, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090067500 *Oct 10, 2008Mar 12, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090074062 *Oct 8, 2008Mar 19, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090074063 *Oct 8, 2008Mar 19, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090074064 *Oct 9, 2008Mar 19, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090074065 *Oct 9, 2008Mar 19, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090074066 *Oct 9, 2008Mar 19, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090074067 *Oct 10, 2008Mar 19, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090074068 *Oct 10, 2008Mar 19, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090074069 *Oct 10, 2008Mar 19, 2009Byeong Moon JeonMethod of deriving a motion vector of a bi-predictive block based on a motion vector of a co-located block in a reference picture
US20090245373 *May 29, 2009Oct 1, 2009Microsoft CorporationVideo coding
US20090304084 *Mar 19, 2009Dec 10, 2009Nokia CorporationCombined motion vector and reference index prediction for video coding
US20090316786 *Apr 10, 2007Dec 24, 2009Nxp B.V.Motion estimation at image borders
US20100135390 *Feb 4, 2010Jun 3, 2010Microsoft CorporationVideo coding
US20110080954 *Sep 24, 2010Apr 7, 2011Bossen Frank JMotion vector prediction in video coding
US20110261882 *Apr 7, 2009Oct 27, 2011Thomson LicensingMethods and apparatus for template matching prediction (tmp) in video encoding and decoding
US20130003851 *Jan 3, 2013General Instrument CorporationMotion vector prediction design simplification
US20130070855 *Mar 21, 2013Qualcomm IncorporatedHybrid motion vector coding modes for video coding
US20130188713 *Feb 28, 2013Jul 25, 2013Sungkyunkwan University Foundation For Corporate CollaborationBi-prediction coding method and apparatus, bi-prediction decoding method and apparatus, and recording medium
US20140016702 *Jun 24, 2013Jan 16, 2014Panasonic CorporationImage decoding method and image decoding apparatus
US20140098870 *Dec 11, 2013Apr 10, 2014Lg Electronics Inc.Method to derive at least one motion vector of a bi-predictive block in a current picture
US20140098871 *Dec 11, 2013Apr 10, 2014Lg Electronics Inc.Method to derive at least one motion vector of a bi-predictive block in a current picture
US20140098872 *Dec 11, 2013Apr 10, 2014Lg Electronics Inc.Method to derive at least one motion vector of a bi-predictive block in a current picture
US20140098873 *Dec 11, 2013Apr 10, 2014Lg Electronics Inc.Method to derive at least one motion vector of a bi-predictive block in a current picture
US20140098874 *Dec 11, 2013Apr 10, 2014Lg Electronics Inc.Method to derive at least one motion vector of a bi-predictive block in a current picture
US20140098875 *Dec 11, 2013Apr 10, 2014Lg Electronics Inc.Method to derive at least one motion vector of a bi-predictive block in a current picture
US20150264394 *Apr 28, 2015Sep 17, 2015Lg Electronics Inc.Method to derive at least one motion vector of a bi-predictive block in a current picture
EP2186341A1 *Apr 23, 2008May 19, 2010Samsung Electronics Co., Ltd.Method and apparatus for estimating and compensating spatiotemporal motion of image
EP2186341A4 *Apr 23, 2008Apr 11, 2012Samsung Electronics Co LtdMethod and apparatus for estimating and compensating spatiotemporal motion of image
WO2006014057A1 *Jul 29, 2005Feb 9, 2006Daeyang FoundationMethod, medium, and apparatus predicting direct mode motion of a multi-angle moving picture
WO2009056752A1 *Oct 27, 2008May 7, 2009Ateme SaMethod and system for estimating future motion of image elements from the past motion in a video coder
Classifications
U.S. Classification375/240.12, 375/E07.25, 375/E07.211, 375/E07.165, 375/E07.119, 375/E07.176, 375/E07.258, 375/E07.266, 375/E07.133, 375/E07.262
International ClassificationH03M7/36, G06T9/00, H04N19/593, H04N7/12
Cooperative ClassificationH04N19/61, H04N19/593, H04N19/577, H04N19/573, H04N19/56, H04N19/513, H04N19/51, H04N19/176, H04N19/142, H04N19/137, H04N19/105, H04N19/102
European ClassificationH04N7/26A6C4, H04N7/26A4, H04N7/26M2, H04N7/26A4B, H04N7/26A8B, H04N7/36C2, H04N7/34B, H04N7/46E, H04N7/26M4I, H04N7/26A6C6, H04N7/50, H04N7/36C8
Legal Events
Date | Code | Event
Aug 11, 2003 | AS | Assignment
Owner name: MICROSOFT CORPORATION, WASHINGTON
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TOURAPIS, ALEXANDROS;LI, SHIPENG;WU, FENG;REEL/FRAME:014372/0935;SIGNING DATES FROM 20030606 TO 20030611
Jan 15, 2015 | AS | Assignment
Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034766/0001
Effective date: 20141014