US RE38078 E1 Abstract Apparatus and method for a Parallel Query Z-coordinate Buffer are described. The apparatus and method perform a keep/discard decision on screen coordinate geometry before the geometry is converted or rendered into individual display screen pixels by implementing a parallel searching technique within a novel z-coordinate buffer based on a novel magnitude comparison content addressable memory (MCCAM) structure. The MCCAM provides structure and method for performing simultaneous arithmetic magnitude comparisons on numerical quantities. These arithmetic magnitude comparisons include arithmetic less-than, greater-than, less-than-or-equal-to, and greater-than-or-equal-to operations between coordinate values of a selected graphical object and the coordinate values of other objects in the image scene which may or may not occult the selected graphical object. The structure and method support variations and combinations of bounding box occulting tests, vertex bounding box occulting tests, span occulting tests, and raster-write occulting tests.
Claims(122) 1. A method of writing graphics images from a set of geometry data stored in a data memory to rasterized pixel data stored in a frame buffer for presentation on a display screen, said method comprising the steps of:
storing a number for use as a z-coordinate value with each image pixel location represented by said frame buffer in a magnitude comparison content addressable memory buffer;
selecting a geometry data item having at least one z-coordinate value from said set of geometry data stored in said data memory;
generating a bounding box around said selected geometry data;
determining a minimum z-coordinate value of said selected geometry data;
simultaneously performing for each of a plurality of pixels defining said selected geometry item, prior to generation of pixel raster data from said selected geometry data item, an arithmetic magnitude comparison between said minimum z-coordinate value and said previously stored z-coordinate values associated with each of said plurality of pixels, currently stored on a pixel-by-pixel basis in said magnitude comparison content addressable memory buffer; and
identifying, prior to generation of pixel raster data from said selected geometry data, pixels of said selected geometry data for which said minimum z-coordinate value is greater than said previously stored z-coordinate values within said bounding box; and
ignoring said identified pixels of said selected geometry data when writing said geometry data as raster data to said frame buffer so that said identified pixels are not written to said frame buffer.
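The keep/discard decision of claim 1 can be modeled in software. The sketch below is a minimal illustrative model, not the patented hardware: a per-pixel loop stands in for the MCCAM's simultaneous comparisons, and all names (`MccamZBuffer`, `survivors`) are assumptions for illustration.

```python
# Hypothetical software model of the claimed pre-rasterization occlusion test.
class MccamZBuffer:
    """Models a magnitude-comparison CAM holding one z value per pixel.

    Real hardware would evaluate every comparison simultaneously; the
    per-pixel loop here merely stands in for that parallel query.
    """
    def __init__(self, width, height, far=float("inf")):
        self.z = [[far] * width for _ in range(height)]

    def survivors(self, bbox, z_min):
        """Return pixel locations inside bbox whose stored z is NOT in
        front of z_min, i.e. pixels the new geometry might still cover."""
        x0, y0, x1, y1 = bbox
        return [(x, y)
                for y in range(y0, y1 + 1)
                for x in range(x0, x1 + 1)
                if not (z_min > self.z[y][x])]  # keep unless strictly occulted

buf = MccamZBuffer(4, 4)
buf.z[1][1] = 0.2                  # something nearer was already drawn at (1, 1)
keep = buf.survivors((0, 0, 2, 2), z_min=0.5)
# (1, 1) is discarded before any raster data is generated for it
assert (1, 1) not in keep and (0, 0) in keep
```

In this model, discarded pixels never reach rasterization, which is the point of performing the comparison before pixel raster data is generated.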
2. The method of
3. The method of
replacing said previously stored z-coordinate values in said magnitude comparison content addressable memory buffer with the z-coordinate values of said selected geometry data when said minimum z-coordinate value is less than said previously stored z-coordinate values.
4. The method in
5. The method of
6. The method of
storing a plurality of words, each of said words comprising a plurality of data fields, each of said data fields being divided into a plurality of data bits;
receiving an input comprising a plurality of input fields matching some of said data fields of said words, and dividing each said input field of said received input into input bits so as to have a one-to-one bit correspondence to the data bits in said data fields of said stored words;
simultaneously comparing said plurality of input fields to all said words such that each said data field is compared to its corresponding said input, and for simultaneously generating a query result for each said word which is true when all said data fields within said word which are compared to one of said inputs compare favorably to each corresponding said input;
flag memory means for storing a flag bit equal to said query result for each of said words;
writing data to multiple words of said magnitude comparison content addressable memory simultaneously; and
performing multiple simultaneous queries of words of said magnitude comparison content addressable memory.
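The word/field query of claim 6 can be sketched as follows. This is an illustrative software stand-in, not the patented circuit: the loop over words models what the MCCAM does simultaneously, and all names (`mccam_query`, the field keys) are assumptions.

```python
import operator

# Illustrative model of the claim 6 query: each stored word holds several
# data fields; a query compares matching input fields against every word
# at once and latches a per-word flag bit (true only when all compared
# fields compare favorably).
def mccam_query(words, inputs, ops):
    """words: list of dicts, one per stored word;
    inputs: field -> reference value (a subset of the word's fields);
    ops: field -> comparison operator, e.g. operator.lt.
    Returns one flag bit per word; hardware would do all words in parallel."""
    flags = []
    for w in words:
        flags.append(all(op(w[f], inputs[f]) for f, op in ops.items()))
    return flags

words = [{"x": 3, "y": 7, "z": 10}, {"x": 5, "y": 2, "z": 1}]
flags = mccam_query(words, {"z": 5}, {"z": operator.lt})  # query: z < 5 ?
assert flags == [False, True]
```

The returned list corresponds to the per-word flag bits of the flag memory recited in the claim.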
7. The method of
8. The method of
Pixel(x,y)=0 if [(x&lt;x_{min}) ∪ (x&gt;x_{max}) ∪ (y&lt;y_{min}) ∪ (y&gt;y_{max}) ∪ (z&lt;z_{min})]. 9. The method of
x&lt;x_{min}, x&gt;x_{max}, y&lt;y_{min}, y&gt;y_{max}, and z&lt;z_{min}. 10. The method of
means for performing an arithmetic less-than operation between said x and said x_{min} to generate an x_{min} comparison result;
means for performing an arithmetic greater-than operation between said x and said x_{max} to generate an x_{max} comparison result;
means for performing an arithmetic less-than operation between said y and said y_{min} to generate a y_{min} comparison result;
means for performing an arithmetic greater-than operation between said y and said y_{max} to generate a y_{max} comparison result;
means for performing an arithmetic less-than operation between said z and said z_{min} to generate a z_{min} comparison result; and
logic means receiving said x_{min}, x_{max}, y_{min}, y_{max}, and z_{min} comparison results and generating an occultation result based on said received comparison results and predetermined rules. 11. The method of
_{min} comparison is performed simultaneously, including comparisons for pixels outside said projected bounding box; whereby the value of occulted is determined in an amount of time independent of both the number of pixels and the size of the projected bounding box. 12. The method of
terminating said occulting test at any time occulted is set to false; and
testing only pixels within said projected bounding box for which x_{min}≦x≦x_{max} and y_{min}≦y≦y_{max}. 13. The method in
values while maintaining said x-values and y-values of said comparisons fixed. 14. The method in
15. The method in
16. The method of
image blocks. 17. A graphics rendering system comprising:
a magnitude comparison content addressable memory for (i) receiving coordinate values including numerical z-coordinate values of corresponding geometry data, and (ii) for performing simultaneous arithmetic magnitude comparisons between a plurality of said received coordinate values with previously stored coordinate values in said magnitude comparison content addressable memory, and (iii) for generating an occulting signal representing results from said plurality of arithmetic magnitude comparisons of said received coordinate values with said stored coordinate values, and (iv) for updating said stored coordinate values in response to a result of said arithmetic magnitude comparisons;
a processor coupled to said magnitude comparison content addressable memory for processing said geometry data responsive to said generated occulting signal from said magnitude comparison content addressable memory; and
a frame buffer coupled to said processor for storing said processed geometry data from said processor.
18. The graphics rendering system recited in
a z-buffer for storing actual z-coordinate values corresponding to said coordinate values of said geometry data from said processor; and wherein said z-coordinate values stored in said magnitude comparison content addressable memory are approximate z-coordinate values of said geometry data, said approximate z-coordinate values being greater-than-or-equal-to said actual z-coordinate values, so that when new geometry data is compared to said stored approximate z-coordinate value then said new geometry data is never declared occulted based on said approximate z-coordinate value if it would not be declared occulted based on said actual z-coordinate value.
19. The graphics rendering system of
means for storing a plurality of words, each of said words comprising a plurality of data fields, each of said data fields being divided into a plurality of data bits;
means for providing an input to said MCCAM, said input comprising a plurality of input fields matching some of said data fields, each said input being divided into input bits so as to have a one-to-one bit correspondence to said data bits in said data fields;
query means for simultaneously comparing said plurality of input fields to all of said words so that each said data field is compared to its corresponding input field, and for simultaneously generating a query result for each said word which result is true only when all said data fields within said word compare favorably to each corresponding said input; said query means including arithmetic magnitude comparator means associated with each said word storing a pixel z-value for comparing said stored pixel z-value with a reference value; and
flag memory means for storing a flag bit equal to said query result for each of said words.
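The conservative-approximation condition of claim 18 (each stored approximate z is greater-than-or-equal-to the actual z) guarantees that the pre-test can only fail to cull, never wrongly cull. A minimal sketch of that property, with all names and the quantization step chosen for illustration rather than taken from the patent:

```python
import math

# Sketch of the claim 18 property: the MCCAM may hold a rounded-up z,
# so the occlusion pre-test errs only on the safe side.
def round_up_z(z, step=0.25):
    """Quantize z upward; the approximation is always >= the exact value."""
    return math.ceil(z / step) * step

def occulted(new_z_min, stored_z):
    """New geometry is declared occulted when its nearest z lies behind
    the stored z."""
    return new_z_min > stored_z

exact = 0.30
approx = round_up_z(exact)          # 0.5, never less than 0.30
assert approx >= exact
# Anything declared occulted against the approximation would also be
# occulted against the exact value -- never the other way around.
for z_min in (0.1, 0.4, 0.6, 1.0):
    if occulted(z_min, approx):
        assert occulted(z_min, exact)
```

This is why the approximate values can safely stand in for the exact z-buffer contents during the parallel query.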
20. The system of
said magnitude comparison content addressable memory further comprises means for writing data to multiple words of said magnitude comparison content addressable memory simultaneously; and
means for performing multiple simultaneous queries of words of said magnitude comparison content addressable memory;
whereby the number of clock cycles needed to process geometry data is reduced.
21. The graphics rendering system in
means for performing simultaneous arithmetic magnitude comparisons between all of said received coordinate values with said previously stored coordinate values in said magnitude comparison content addressable memory; and
wherein said arithmetic magnitude comparison operations are selected from the group consisting of a greater-than arithmetic operation, a greater-than-or-equal-to arithmetic operation, a less-than arithmetic operation, and a less-than-or-equal-to arithmetic operation.
22. A method of converting imagery in the form of a set of geometry data stored in a data memory to rasterized pixel data stored in a frame buffer for presentation on a display screen, said method comprising the steps of:
storing a number, for use as a z-coordinate value with each of a plurality of image pixels represented by said frame buffer, in a storage location within a magnitude comparison content addressable memory buffer;
selecting a geometry data item having at least one z-coordinate value from said set of geometry data;
generating an approximating boundary characterization of said selected geometry data based on coordinate parameters of said selected geometry data;
determining a minimum z-coordinate value of said selected geometry data;
simultaneously performing, for each of said plurality of pixels, an arithmetic magnitude comparison test between said geometry data minimum z-coordinate value and said stored z-coordinate values of said image pixels; and
identifying, on a pixel-by-pixel basis, any of said pixels forming a portion of said selected geometry data for which said minimum z-coordinate value is greater than said previously stored z-coordinate values within said boundary characterization, prior to rasterization of said selected geometry data; and
rasterizing only portions of said geometry data not identified as having said minimum z-coordinate value greater than said previously stored z-coordinate. 23. The method of
setting, for each of said pixels prior to performing said comparison test, an occulted parameter to true to indicate that the geometrical location associated with said pixel is occulted, independent of whether said pixel is actually occulted or not occulted;
determining the current z-coordinate value for each said pixel;
generating an approximating boundary characterization of said selected geometry data based on coordinate parameters of said selected geometry data;
determining whether said pixel is within a projection of said generated approximating boundary characterization for said selected geometry data, and if said pixel is within said projection, then comparing said current z-coordinate value for said pixel to said geometry data minimum z-coordinate value z_{min}; if z_{min} is not greater than said z-coordinate value of said pixel, then setting said occulted parameter to false for said pixel indicating that said new geometry data may not be occulted and requires additional processing to determine whether said geometry data is occulted or not; and wherein every one of said pixels is evaluated for occultation simultaneously in a predetermined number of clock cycles independent of the number of pixels and independent of the size of said approximating boundary characterization for said selected geometry data.
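The flag protocol of claim 23 (presume every pixel occulted, then clear the flag wherever the stored z does not beat the geometry's minimum z) can be walked through in software. A minimal sketch under illustrative assumptions; the loop models what the hardware does for all pixels in a fixed number of clock cycles:

```python
# Software walk-through of the claim 23 occulted-flag protocol.
def occultation_test(stored_z, bbox, z_min):
    """stored_z: dict (x, y) -> current z; bbox: (x0, y0, x1, y1) stands in
    for the approximating boundary characterization.
    Returns dict (x, y) -> occulted flag."""
    occulted = {p: True for p in stored_z}          # presume occulted
    x0, y0, x1, y1 = bbox
    for (x, y), z in stored_z.items():
        inside = x0 <= x <= x1 and y0 <= y <= y1
        if inside and not (z_min > z):              # z_min <= stored z
            occulted[(x, y)] = False                # may be visible
    return occulted

zbuf = {(0, 0): 1.0, (1, 0): 0.1, (5, 5): 1.0}
flags = occultation_test(zbuf, bbox=(0, 0, 2, 2), z_min=0.5)
assert flags[(0, 0)] is False   # may be visible; needs further processing
assert flags[(1, 0)] is True    # stays presumed occulted
assert flags[(5, 5)] is True    # outside the boundary characterization
```

Pixels whose flag remains true after the test contribute nothing further and can be skipped by later stages.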
24. The method of
25. The method of
terminating said magnitude comparison test whenever a pixel is determined not to be occulted and said occulted value is set to false; and
testing only pixels within said approximating boundary characterization for said selected geometry data rather than all of said pixels within said display screen.
26. The method of
27. The method of
storing geometrical parameters for each pixel in a word in a data structure, said word including an x-field for storing and comparing the pixel x-value, a y-field for storing and comparing the pixel y-value, a z-field for storing and comparing the pixel z-value, and a logic field for storing and generating result signals including a pixel miss signal.
28. The method of
whereby said infinity flag field for all pixels may be set in parallel, thereby eliminating any need to separately write each z-field with a large numerical z-value; said infinity flag field being cleared whenever a new z-value is written to the associated pixel; and
wherein said arithmetic magnitude comparison test comprises applying said comparison test to all of said pixels previously stored in said magnitude comparison content addressable memory (MCCAM) z-buffer, and generating a first boolean value for each pixel when the following formula is satisfied:
Pixel(x,y)=0 if [(x&lt;x_{min}) ∪ (x&gt;x_{max}) ∪ (y&lt;y_{min}) ∪ (y&gt;y_{max}) ∪ {(z&lt;z_{min}) ∩ NOT(Infinity_Flag)}]. 29. The method of
tagging pixels generating said first boolean value indicating said pixel is not occulted for a subsequent processor to use in rendering temporally related scenes.
30. The method of
grouping said tagged pixels together into segments along a display raster line.
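The tagging and grouping of claims 29 and 30 amount to coalescing adjacent non-occulted pixels on each raster line into spans. A minimal illustrative sketch (all names are assumptions, not from the patent):

```python
# Sketch of claims 29-30: pixels that pass the test are tagged, then
# adjacent tagged pixels on a raster line are coalesced into spans that a
# later processing stage can render.
def spans_per_line(tagged):
    """tagged: iterable of (x, y) pixel tags.
    Returns dict y -> list of (x_start, x_end) contiguous segments."""
    lines = {}
    for x, y in sorted(tagged, key=lambda p: (p[1], p[0])):
        segs = lines.setdefault(y, [])
        if segs and segs[-1][1] == x - 1:
            segs[-1] = (segs[-1][0], x)      # extend the current span
        else:
            segs.append((x, x))              # open a new span
    return lines

tags = [(2, 0), (3, 0), (4, 0), (7, 0), (1, 1)]
assert spans_per_line(tags) == {0: [(2, 4), (7, 7)], 1: [(1, 1)]}
```

Working in spans rather than individual pixels lets the subsequent processor handle each raster line with far fewer operations.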
31. A method of converting geometry data having coordinates in a three-
dimensional scene to pixels having color values in a two-dimensional frame, the method comprising: (
a) storing pixel distance values, at least one of the pixel distance values being associated with a pixel location in the frame; (
b) for each of a plurality of multiple-pixel cells, generating and storing at least one cell distance value, each cell distance value being a single value representing all of the pixel distance values of the corresponding pixel locations within the cell; (
c) selecting a subset of the geometry data; (
d) determining, for the selected subset, a geometry distance value that represents the closest distance to the geometry data in the selected subset; (
e) simultaneously performing a plurality of magnitude comparisons between the geometry distance value and a plurality of stored cell distance values on a pixel-by-pixel basis; (
f) indicating the selected subset as hidden, on a cell-by-cell basis, if the geometry distance value is greater than the stored cell distance values; (
g) for any geometry data indicated as not hidden on a cell-by-cell basis, generating pixel-based depth values to determine if the geometry is hidden on a pixel-by-pixel basis; (
h) for any geometry data indicated as hidden on a cell-by-cell basis, omitting generation of color values and pixel-based depth values; and (
i) generating cell-based depth values from pixel-based depth values, storing the cell-based depth values for use in step (e).32. The method of
g) further comprising: for any geometry data not indicated as hidden on a cell-by-cell basis, generating color values for pixels not hidden on a pixel-by-pixel basis. 33. The method of
g).34. The method of
a) through (g) are performed once for each of a plurality of subsets of the geometry data. 35. The method of
a) through (f) are performed once for each of a plurality of subsets of the geometry data and then subsequently performing step (g).36. The method of
processing multiple frame instances, each frame instance being generated for a plurality of multiple-cell blocks, each block being generated by performing steps (a) through (f) once for each of a particular plurality of subsets of the geometry data and then subsequently performing step (g), the particular plurality of subsets being those that potentially affect the block being generated. 37. The method of
cell blocks are generated in parallel. 38. The method of
a) through (f) for each block, at least one tag value is stored identifying an associated subset of geometry data, if the associated subset of geometry is not indicated as hidden at a pixel location. 39. The method of
by-pixel basis. 40. The method of
for any geometry data not indicated as hidden on a cell-by-cell basis, generating color values for pixels not hidden on a pixel-by-pixel basis; and the generating color values step further includes retrieving each previously stored tag value on a pixel-by-pixel basis, each retrieved tag value identifying the geometry subset responsible for the final color of the corresponding pixel, and using the identified geometry subset to generate the final color value. 41. The method of
prior to the simultaneously performing step, choosing a contiguous plurality of cells whose shape and location approximate a boundary characterization of the selected subset, and
during the simultaneously performing step, only considering cell distance values for cells in the contiguous plurality of cells.
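The cell-level test of claims 31 and 41 stores one distance per multi-pixel cell (the farthest of its pixels, which keeps the test conservative) and compares geometry against cells rather than pixels. A minimal sketch; the 2x2 cell size and all names are illustrative assumptions:

```python
# Sketch of the cell-level hidden test of claims 31 and 41.
def cell_distances(pixel_z, cell=2):
    """pixel_z: 2-D list of per-pixel depths. Returns a 2-D list with one
    value per cell: the max over the cell's pixels (step (b))."""
    h, w = len(pixel_z), len(pixel_z[0])
    return [[max(pixel_z[y + dy][x + dx]
                 for dy in range(cell) for dx in range(cell))
             for x in range(0, w, cell)]
            for y in range(0, h, cell)]

def hidden_cells(cells, geom_z_min, touched):
    """Steps (e)/(f): compare once per touched cell; a cell hides the
    geometry iff the geometry's closest distance lies behind the cell's
    farthest stored distance."""
    return {c for c in touched if geom_z_min > cells[c[1]][c[0]]}

z = [[0.1, 0.1, 0.9, 0.9],
     [0.1, 0.1, 0.9, 0.9]]
cells = cell_distances(z)               # one value per 2x2 cell
hid = hidden_cells(cells, geom_z_min=0.5, touched={(0, 0), (1, 0)})
assert hid == {(0, 0)}                  # left cell culls; right cell cannot
```

Restricting `touched` to cells overlapping the boundary characterization is the optimization of claim 41: comparisons are only spent where the geometry could land.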
42. The method of
43. The method of
prior to the selecting subset step, processing the frame in sub-frame blocks of pixels, the sub-frame blocks of pixels having rectangular shape and a substantially uniform size where the size is determined based on the number of magnitude comparisons that can be performed simultaneously. 44. The method of
e) is accomplished in a predetermined number of clock cycles. 45. The method of
e) is accomplished in one clock cycle. 46. The method of
e) includes performing the magnitude comparisons using a magnitude comparison content addressable memory. 47. The method of
c) and (d) are performed once for each of a plurality of subsets of the geometry data and then subsequently performing steps (a), (b), (e), (f), and (g) for each of the plurality of subsets in increasing order of the corresponding geometry distance values. 48. The method of
49. The method of
50. The method of
51. The method of
processing multiple frame instances, each frame instance being generated for a plurality of multiple-cell blocks, each block being generated by repetitively performing steps (a) through (f) for geometry subsets of the geometry database that potentially affect the block, step (g) being postponed until all the geometry data has been selected. 52. The method of
cell blocks are generated in parallel. 53. The method of
54. The method of
55. The method of
b), the storing of cell distance values is performed in parallel for a second plurality of cells. 56. The method of
57. The method of
processing multiple frame instances, each frame instance being generated for a plurality of multiple-cell blocks, each block being generated by performing steps (a) through (g) once for each of a particular plurality of subsets of the geometry data, the particular plurality of subsets being those subsets that potentially affect the block being generated. 58. The method of
cell blocks are generated in parallel. 59. The method of
a) through (g) for each block, at least one tag value is stored identifying an associated subset of geometry data, if the associated subset of geometry is not indicated as hidden at a pixel location. 60. The method of
by-pixel basis. 61. The method of
completing a particular frame instance, and during the generating of each block of a subsequent frame, selecting those geometry subsets having associated previously stored tag values before selecting other geometry subsets.
62. The method of
by-cell basis. 63. The method of
64. The method of
65. The method of
66. The method of
67. The method of
68. The method of
buffer. 69. The method of
70. The method of
only. 71. The method of
72. The method of
73. A method of converting polygons having coordinates in a three-
dimensional scene to pixels having color values in a two-dimensional frame, the method comprising: (
a) storing pixel distance values, at least one of the pixel distance values being associated with each pixel location; (
b) for each of a plurality of multiple-pixel cells, generating and storing at least one cell distance value, each cell distance value being a single value representing all of the pixel distance values of the corresponding pixel locations within the cell; (
c) selecting at least one of the polygons; (
d) determining, for the selected polygon, a polygon distance value, the polygon distance value representing the smallest distance to the selected polygon; (
e) simultaneously performing a plurality of magnitude comparisons between the polygon distance value and a plurality of previously stored cell distance values on a pixel-by-pixel basis; (
f) indicating the polygon as hidden, on a cell-by-cell basis, if the stored distance value represents a location closer than the polygon distance value represents; (
g) for any polygon indicated as not hidden on a cell-by-cell basis, generating pixel-based depth values to determine if the geometry is hidden on a pixel-by-pixel basis; (
h) generating color values and pixel-based depth values, on a pixel-by-pixel basis, omitting any polygons indicated as hidden at that pixel; and (
i) generating cell-based depth values from pixel-based depth values, storing the cell-based depth values for use in step (e).74. A method of processing polygons having coordinates in a three-
dimensional scene and pixels in a two-dimensional frame, the method comprising: (
a) for each of a plurality of multiple-pixel cells, generating and storing at least one cell distance value, the cell distance value being generated from corresponding pixel distance values and not less than any of those corresponding pixel distance values; (
b) determining, for a selected polygon, a polygon distance value, the polygon distance value representing the smallest distance to the selected polygon; and (
c) simultaneously performing a plurality of magnitude comparisons between the polygon distance value and a plurality of previously stored cell distance values on a pixel-by-pixel basis. 75. The method of
(
d) indicating the selected polygon as hidden, on a cell-by-cell basis, if the stored distance value is closer. 76. The method of
d) of indicating the selected polygon as hidden, on a cell-by-cell basis, if the stored distance value is closer, comprises a step of indicating the selected polygon as hidden, on a cell-by-cell basis, if the stored distance value is closer to a selected viewing location. 77. The method of
(
e) for any geometry data indicated as hidden on a cell-by-cell basis, omitting generation of color values. 78. The method of
(
f) for any geometry data not indicated as hidden on a cell-by-cell basis, generating color values for pixels not hidden on a pixel-by-pixel basis. 79. A graphics processor comprising:
polygon input means, coupled to an external host processor, to receive polygon data, including coordinate data, associated with each of a plurality of polygons representing a three-dimensional scene; polygon processing means, coupled to the input means, to receive polygon coordinate values associated with a selected set of one or more polygons and generate a closest polygon distance value representing the closest distance to any point on the selected polygons;
pixel distance means to store pixel distance values associated with individual pixel locations included in a two-dimensional frame of pixels, pixel distance values being received from pixel output means; cell processing means, coupled to the pixel distance means, to receive pixel distance values associated with a cell group of pixel locations and generate at least one cell distance value representing all of the pixel locations in the selected cell;
parallel magnitude comparison content addressable memory means, coupled to the cell processing means and the polygon processing means, to receive and store a plurality of the generated cell distance values, and to simultaneously compare a plurality of the stored cell distance values with the closest polygon distance value and indicate on a pixel-by-pixel basis, for all of the pixel locations represented by each of the cells, if the selected set of polygons is hidden; and pixel output means, coupled to the input means and the parallel comparison means, to receive polygon data omitting polygons indicated as hidden at each pixel location and generate color values and pixel distance values for each pixel location in the frame.
80. The graphics processor of
81. The graphics processor of
82. The graphics processor of
frame blocks of pixels, the array size based on the available number of parallel comparison circuits. 83. The graphics processor of
84. The graphics processor of
tag means, coupled to the polygon processing means and the pixel output means, to receive and store tags identifying which polygons are responsible for pixel color in order to postpone generating color values for each pixel until all updating of pixel distance values is done.
85. The graphics processor of
86. The graphics processor of
87. The graphics processor of
88. The graphics processor of
89. The graphics processor of
90. A graphics processing system comprising:
a graphics processor including:
polygon input means, coupled to an external host processor, to receive polygon data, including coordinate data, associated with each of a plurality of polygons representing a three-dimensional scene; pixel distance means to store pixel distance values associated with individual pixel locations included in a two-dimensional frame of pixels, pixel distance values being received from pixel output means; parallel magnitude comparison content addressable memory means, coupled to the cell processing means and the polygon processing means, to receive and store a plurality of the generated cell distance values, and to simultaneously compare each of the stored cell distance values with the closest polygon distance value and indicate on a pixel-by-pixel basis, for all of the pixel locations represented by each of the cells, if the selected set of polygons is hidden; and pixel output means, coupled to the input means and the parallel comparison means, to receive polygon data omitting polygons indicated as hidden at each pixel location and generate color values and pixel distance values for each pixel location in the frame; and
a frame buffer, coupled to the graphics processor, to receive and store the generated color values for each pixel location in the frame.
91. A graphics processing system according to
a host processor configured to run graphics application software and manipulate geometry data that represents one or more three-dimensional scenes, each scene including polygon data, including coordinate data, associated with each of a plurality of polygons. 92. A graphics processing system comprising:
a plurality of graphics processors, each of the graphics processors including:
polygon input means, coupled to an external host processor, to receive polygon data, including coordinate data, associated with each of a plurality of polygons representing a three-dimensional scene; pixel distance means to store pixel distance values associated with individual pixel locations included in a two-dimensional frame of pixels, pixel distance values being received from pixel output means; parallel magnitude comparison content addressable memory means, coupled to the cell processing means and the polygon processing means, to receive and store a plurality of the generated cell distance values, and to simultaneously compare each of the stored cell distance values with the closest polygon distance value and indicate on a pixel-by-pixel basis, for all of the pixel locations represented by each of the cells, if the selected set of polygons is hidden; and a frame buffer, wherein the frame buffer includes a plurality of frame buffer blocks, each frame buffer block coupled to a corresponding graphics processor to receive and store the generated color values for each pixel location in a predetermined block portion of the frame.
93. A graphics processing system according to
a host processor configured to run graphics application software and manipulate geometry data that represents one or more three-dimensional scenes, each scene including polygon data, including coordinate data, associated with each of a plurality of polygons. 94. A graphics processor comprising:
polygon processing means, receiving polygon data associated with a selected set of one or more polygons representing a three-dimensional scene and generating a closest polygon distance value representing the closest distance to any point on the selected polygons; pixel distance means to store pixel distance values associated with individual pixel locations included in a two-dimensional frame of pixels, pixel distance values being received from pixel output means; cell processing means receiving pixel distance values associated with a cell group of pixel locations included in a two-dimensional scene and generating at least one cell distance value; parallel magnitude comparison content addressable memory means, coupled to the cell processing means and the polygon processing means, to receive and store a plurality of the generated cell distance values, and to simultaneously compare each of the stored cell distance values with the closest polygon distance value and indicate on a pixel-by-pixel basis if the selected set of polygons is hidden; and pixel output means, coupled to the input means and the parallel comparison means, to receive polygon data omitting polygons indicated as hidden at each pixel location and generate color values and pixel distance values for each pixel location in the frame.
95. The graphics processor of
pixel output means, coupled to the input means and the parallel comparison means, to receive polygon data omitting polygons indicated as hidden at each pixel location and generate color values for each pixel location in the frame.
96. The graphics processor of
97. A graphics processor comprising:
a polygon processor, coupled to an external host processor to receive polygon data, including coordinate data, associated with each of a plurality of polygons representing a three-dimensional scene, including a control circuit that selects a set of one or more polygons and a geometry processor that computes a closest polygon distance value representing the closest distance to any point on the selected polygons; a cell processor processing cells on a pixel-by-pixel basis, coupled to a pixel distance buffer to receive pixel distance values associated with a cell group of pixel locations, including first arithmetic magnitude comparison content addressable memory comparators that identify, for each of a plurality of cells, a farthest cell distance value to represent all of the pixel locations in the each cell, and second arithmetic comparators that simultaneously compare each of the farthest cell distance values with the closest polygon distance value; and a pixel processor, coupled to the polygon processor to receive polygon data and coupled to the cell processor to receive comparison results, including a circuit that selectively omits generation of color values for hidden portions of polygons, including for polygons and cell groups of pixels where the closest polygon distance value is greater than the farthest cell distance value.
98. A graphics processor comprising:
a polygon processor, coupled to an external host processor to receive polygon data, including coordinate data, associated with each of a plurality of polygons representing a three-dimensional scene, including a control circuit that selects a set of one or more polygons and a geometry processor that computes a closest polygon distance value representing the closest distance to any point on the selected polygons; a pixel distance buffer holding pixel distance values associated with individual pixel locations included in a two-dimensional frame of pixels; a cell processor processing cells on a pixel-by-pixel basis, coupled to the pixel distance buffer to receive pixel distance values associated with a cell group of pixel locations, including first arithmetic magnitude comparison content addressable memory comparators that identify, for each of a plurality of cells, a farthest cell distance value to represent all of the pixel locations in each cell, and second arithmetic comparators that simultaneously compare each of the farthest cell distance values with the closest polygon distance value; and
99. A method of converting geometry data having coordinates in a three-dimensional scene to pixels having color values in a two-dimensional frame, the method comprising:
(a) storing pixel distance values, at least one of the pixel distance values being associated with a pixel location in the frame;
(b) for each of a plurality of multiple-pixel cells, generating and storing at least one cell distance value, each cell distance value being a single value representing all of the pixel distance values of the corresponding pixel locations within the cell;
(c) selecting a subset of the geometry data;
(d) determining, for the selected subset, a geometry distance value that represents the closest distance to the geometry data in the selected subset;
(e) simultaneously performing on a pixel-by-pixel basis in a magnitude comparison content addressable memory a plurality of magnitude comparisons between the geometry distance value and a plurality of stored cell distance values using a tiled array of identical circuit blocks, each of a plurality of the circuit blocks being coupled to a common input databus for the geometry distance value, each of a plurality of the circuit blocks having an individually addressable register for storing a respective cell distance value, each of a plurality of the circuit blocks having a magnitude comparator coupled to the geometry distance value and the respective cell distance value;
(f) indicating the selected subset as hidden, on a cell-by-cell basis, if the geometry distance value is greater than the stored cell distance values;
(g) for any geometry data indicated as not hidden on a cell-by-cell basis, generating pixel-based depth values to determine if the geometry is hidden on a pixel-by-pixel basis;
(h) for any geometry data indicated as hidden on a cell-by-cell basis, omitting generation of color values and pixel-based depth values;
(i) generating cell-based depth values from pixel-based depth values, storing the cell-based depth values for use in step (e);
(j) for any geometry data not indicated as hidden on a cell-by-cell basis, generating color values for pixels not hidden on a pixel-by-pixel basis; and
(k) processing multiple frame instances, each frame instance being generated for a plurality of multiple-cell blocks, each block being generated by repetitively performing steps (a) through (f) for geometry subsets of the geometry database that potentially affect the block, steps (g) through (i) being postponed until after the entire geometry database has been traversed; and wherein the geometry data comprises one or more of polygons, surfaces, volumes and objects; and the cell distance value is not less than the maximum of the pixel distance values of the corresponding pixel locations within the cell.
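The cell-based culling recited in steps (a)-(f) above can be modeled in software. The Python sketch below is illustrative only: all names are hypothetical, and the loops simulate what the claim describes as parallel comparator hardware.

```python
# Hypothetical software model of claim 99, steps (b) and (e)-(f):
# a cell stores a single value no less than the maximum of its pixel
# distances, and geometry is hidden in a cell when its closest distance
# is greater than that stored cell value.

def build_cell_distances(pixel_z, cell_w, cell_h):
    """Step (b): one cell distance value per multiple-pixel cell,
    here the maximum of the covered pixel distance values."""
    rows, cols = len(pixel_z), len(pixel_z[0])
    cells = {}
    for cy in range(0, rows, cell_h):
        for cx in range(0, cols, cell_w):
            cells[(cx // cell_w, cy // cell_h)] = max(
                pixel_z[y][x]
                for y in range(cy, min(cy + cell_h, rows))
                for x in range(cx, min(cx + cell_w, cols))
            )
    return cells

def geometry_hidden(cells, covered_cells, geometry_min_z):
    """Steps (e)-(f): the selected geometry is hidden, cell by cell,
    if its closest distance value is greater than every stored cell value."""
    return all(geometry_min_z > cells[c] for c in covered_cells)
```

In the claim, the per-cell comparisons of step (e) happen simultaneously in a tiled array of comparator blocks; the comprehension above merely stands in for that parallelism.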
100. A graphics processor comprising:
polygon processing means, receiving polygon data associated with a selected set of one or more polygons representing a three-dimensional scene; pixel distance means, coupled to the polygon processing means, to receive and selectively store pixel distance values associated with individual pixel locations included in a two-dimensional frame of pixels, pixel distance values being received from pixel output means; cell processing means receiving pixel distance values and generating cell distance values; parallel comparison means, coupled to the cell processing means and the polygon processing means, to simultaneously compare the stored cell distance values with the closest polygon distance value and indicate on a pixel-by-pixel basis, for all of the pixel locations represented by each of the cells, if the selected set of polygons is hidden, wherein completion of the parallel comparison requires a predetermined number of clock cycles, and in a single clock cycle, one parallel comparison is completed and its result stored and another parallel comparison operation is started, and the stored cell distance values represent all the pixel locations corresponding to a rectangular array of pixels configured to process sub-frame blocks of pixels, the array size based on the available number of parallel comparison circuits; tag means, coupled to the polygon processing means and the pixel output means, to receive and store tags identifying which polygons are responsible for pixel color in order to postpone generating color values for each pixel until all updating of pixel distance values is done;
wherein the pixel distance means includes update means, coupled to the parallel comparison means, to receive and selectively store updated pixel distance values depending on cell magnitude comparison results.
101. A graphics rendering system comprising:
a magnitude comparison content addressable memory for:
(i) receiving coordinate values including coordinate values of corresponding geometry data, and
(ii) performing depth comparisons in parallel, said depth comparisons being at least part of simultaneous arithmetic magnitude comparisons between said received coordinate values of corresponding geometry data and previously stored coordinate values, thereby generating a plurality of comparison results, and
(iii) generating an occulting signal representing said plurality of comparison results, and
(iv) updating said stored coordinate values in response to said occulting signal; a processor coupled to said magnitude comparison content addressable memory for processing said geometry data responsive to said occulting signal to generate pixel-based data; and a frame buffer coupled to said processor for storing at least some of said generated pixel-based data.
102. The graphics rendering system of
based data needs to be generated.
103. A graphics rendering system comprising:
a set of memory bits organized into words, where each word can perform one or more arithmetic magnitude comparisons between the stored data and the input data, for:
(i) receiving coordinate values including coordinate values of corresponding geometry data, and
(ii) performing depth comparisons in parallel, said depth comparisons being at least part of simultaneous arithmetic magnitude comparisons between said received coordinate values of corresponding geometry data and previously stored coordinate values, thereby generating a plurality of comparison results, and
(iii) generating an occulting signal representing said plurality of comparison results, and
(iv) updating said stored coordinate values in response to said occulting signal; a processor, coupled to said set of memory bits organized into words, for processing said geometry data responsive to said occulting signal to generate pixel-based data; and a frame buffer coupled to said processor for storing at least some of said generated pixel-based data.
104. The graphics rendering system of
based data needs to be generated.
105. A graphics rendering system comprising:
a magnitude comparison content addressable memory for:
(i) receiving coordinate values including coordinate values of corresponding geometry data, and
(ii) performing depth comparisons in parallel, said depth comparisons being at least part of simultaneous arithmetic magnitude comparisons between said received coordinate values of corresponding geometry data and previously stored coordinate values, thereby generating a plurality of comparison results, and
(iii) generating an occulting signal representing said plurality of comparison results, and
(iv) updating said stored coordinate values in response to said occulting signal; a processor coupled to said magnitude comparison content addressable memory for processing said geometry data responsive to said occulting signal to generate pixel-based data and cell-based data, said cell-based data used in said updating said stored coordinate values in response to said occulting signal; and a frame buffer coupled to said processor for storing at least some of said generated pixel-based data.
106. The graphics rendering system of
based data needs to be generated.
107. A graphics rendering system comprising:
a set of memory bits organized into words, where each word can perform one or more arithmetic magnitude comparisons between the stored data and the input data, for:
(i) receiving coordinate values including coordinate values of corresponding geometry data, and
(ii) performing depth comparisons in parallel, said depth comparisons being at least part of simultaneous arithmetic magnitude comparisons between said received coordinate values of corresponding geometry data and previously stored coordinate values, thereby generating a plurality of comparison results, and
(iii) generating an occulting signal representing said plurality of comparison results, and
(iv) updating said stored coordinate values in response to said occulting signal; a processor, coupled to said set of memory bits organized into words, for processing said geometry data responsive to said occulting signal to generate pixel-based data and cell-based data, said cell-based data used in said updating said stored coordinate values in response to said occulting signal; and a frame buffer coupled to said processor for storing at least some of said generated pixel-based data.
108. The graphics rendering system of
based data needs to be generated.
109. A graphics rendering method comprising the steps:
(a) receiving coordinate values including coordinate values of corresponding geometry data;
(b) performing depth comparisons in parallel using a magnitude comparison content addressable memory, said depth comparisons being at least part of simultaneous arithmetic magnitude comparisons between said received coordinate values of corresponding geometry data and previously stored coordinate values, thereby generating a plurality of comparison results;
(c) generating an occulting signal representing said plurality of comparison results;
(d) updating said stored coordinate values in response to said occulting signal; and
(e) processing said geometry data responsive to said occulting signal to generate pixel-based data.
110. The method of
(f) storing at least some of said generated pixel-based data into a frame buffer.
111. The method of (c) further comprising generating hit information, said hit information indicating which pixel-based data needs to be generated in step (e).
112. A graphics rendering method comprising the steps:
(a) receiving coordinate values including coordinate values of corresponding geometry data;
(b) performing depth comparisons in parallel using a set of memory bits organized into words, where each word can perform one or more arithmetic magnitude comparisons between the stored data and the input data, said depth comparisons being at least part of simultaneous arithmetic magnitude comparisons between said received coordinate values of corresponding geometry data and previously stored coordinate values, thereby generating a plurality of comparison results;
(c) generating an occulting signal representing said plurality of comparison results;
(d) updating said stored coordinate values in response to said occulting signal; and
(e) processing said geometry data responsive to said occulting signal to generate pixel-based data.
113. The method of
(f) storing at least some of said generated pixel-based data into a frame buffer.
114. The method of (c) further comprising generating hit information, said hit information indicating which pixel-based data needs to be generated in step (e).
115. A graphics rendering method comprising the steps:
(a) receiving coordinate values including coordinate values of corresponding geometry data;
(b) performing depth comparisons in parallel using a magnitude comparison content addressable memory, said depth comparisons being at least part of simultaneous arithmetic magnitude comparisons between a plurality of said received coordinate values and previously stored coordinate values, thereby generating a plurality of comparison results;
(c) generating an occulting signal representing said plurality of comparison results;
(d) updating said stored coordinate values in response to said occulting signal; and
(e) processing said geometry data responsive to said occulting signal to generate pixel-based data and cell-based data, said cell-based data generated from pixel-based data.
116. The method of
based data is used in step (d) to update said stored coordinate values.
117. The method of
(f) storing at least some of said generated pixel-based data into a frame buffer.
118. The method of (c) further comprising generating hit information, said hit information indicating which pixel-based data needs to be generated in step (e).
119. A graphics rendering method comprising the steps:
(a) receiving coordinate values including coordinate values of corresponding geometry data;
(b) performing depth comparisons in parallel using a set of memory bits organized into words, where each word can perform one or more arithmetic magnitude comparisons between the stored data and the input data, said depth comparisons being at least part of simultaneous arithmetic magnitude comparisons between a plurality of said received coordinate values and previously stored coordinate values, thereby generating a plurality of comparison results;
(c) generating an occulting signal representing said plurality of comparison results;
(d) updating said stored coordinate values in response to said occulting signal; and
(e) processing said geometry data responsive to said occulting signal to generate pixel-based data and cell-based data, said cell-based data generated from pixel-based data.
120. The method of
based data is used in step (d) to update said stored coordinate values.
121. The method of
(f) storing at least some of said generated pixel-based data into a frame buffer.
122. The method of
(c) further comprising generating hit information, said hit information indicating which pixel-based data needs to be generated in step (e).

Description

The field of this invention is twofold: 1) three-dimensional computer graphics, and more specifically, hidden surface removal in three-dimensional rendering; and 2) computer memories, and more specifically, Content Addressable Memories (CAM).

Three-dimensional Computer Graphics

Computer graphics is the art and science of generating pictures with a computer. Generation of pictures is commonly called rendering. Generally, in three-dimensional computer graphics, geometry that represents surfaces (or volumes) of objects in a scene is translated into pixels stored in a frame buffer, and then displayed on a display device, such as a CRT. Pixels may have a direct one-to-one correspondence with physical display device hardware, but this is not always the case. Some three-dimensional graphics systems reduce aliasing with a frame buffer that has multiple pixels per physical display picture element. Other 3D graphics systems, in order to reduce the rendering task, have multiple physical display picture elements per pixel. In this document, “pixel” refers to the smallest individually controllable element in the frame buffer, independent of the physical display device. A summary of the rendering process can be found in: “Fundamentals of Three-dimensional Computer Graphics”, by Watt, Chapter 5: The Rendering Process, pages 97 to 113, published by Addison-Wesley Publishing Company, Reading, Mass., 1989, reprinted 1991, ISBN 0-201-15442-0 (hereinafter referred to as the Watt Reference). An example of a hardware renderer is incorporated herein by reference: “Leo: A System for Cost Effective 3D Shaded Graphics”, by Deering and Nelson, pages 101 to 108 of SIGGRAPH 93 Proceedings, 1-6 Aug.
1993, Computer Graphics Proceedings, Annual Conference Series, published by ACM SIGGRAPH, New York, 1993, Softcover ISBN 0-201-58889-7 and CD-ROM ISBN 0-201-56997-3 (hereinafter referred to as the Deering Reference). The Deering Reference describes a generic 3D graphics pipeline (i.e., a renderer, or a rendering system) as “truly generic, as at the top level nearly every commercial 3D graphics accelerator fits this abstraction”, and this pipeline diagram is reproduced here as FIG. 1.

The Pixel Drawing Pipeline

In computer graphics, each renderable object generally has its own local object coordinate system, and therefore needs to be translated from object coordinates to pixel display coordinates. Conceptually, this is a 4-step process: 1) translation (including scaling for size enlargement or shrink) from object coordinates to world coordinates, which is the coordinate system for the entire scene; 2) translation from world coordinates to eye coordinates, based on the viewing point of the scene; 3) translation from eye coordinates to perspective translated eye coordinates, where perspective scaling (farther objects appear smaller) has been performed; and 4) translation from perspective translated eye coordinates to pixel coordinates, also called screen coordinates. These translation steps can be compressed into one or two steps by precomputing appropriate translation matrices before any translation occurs. FIG. 3 shows a three-dimensional object, a tetrahedron, with its own coordinate axes. Once the geometry is in screen coordinates, it is rasterized, which is the process of generating actual pixel color values. Many techniques are used for generating pixel color values, including Gouraud shading, Phong shading, and texture mapping. Because many different portions of geometry can affect the same pixel, the geometry representing the surfaces closest to the scene viewing point must be determined.
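The four translation steps, and their compression into a single precomputed matrix, can be sketched as follows. The matrix values are illustrative placeholders, not values from this document:

```python
# Illustrative sketch: composing the object-to-world and world-to-eye
# translation matrices into one matrix before any point is moved.

def mat_mul(a, b):
    """Multiply two 4x4 row-major matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, v):
    """Apply a 4x4 matrix to a homogeneous point (x, y, z, w) and divide
    by w -- the step where farther objects appear smaller."""
    x, y, z, w = (sum(m[i][k] * v[k] for k in range(4)) for i in range(4))
    return (x / w, y / w, z / w)

def translate(tx, ty, tz):
    """A simple affine translation matrix (hypothetical example values)."""
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

# Precompute one combined matrix instead of transforming in two steps:
object_to_world = translate(1, 0, 0)
world_to_eye = translate(0, 2, 0)
object_to_eye = mat_mul(world_to_eye, object_to_world)
```

A real renderer would fold rotation, scaling, and the perspective step into the same precomputed product; only the composition idea is shown here.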
Thus, for each pixel, the closest surface to the viewing point determines the pixel color value, and the other more distant surfaces which could affect the pixel are hidden and are prevented from affecting the pixel. An exception to this rule occurs when non-opaque surfaces are rendered, in which case all non-opaque surfaces closer to the viewing point than the closest opaque surface affect the pixel color value, while all other non-opaque surfaces are discarded. In this document, the term “occulted” is used to describe geometry which is 100% hidden by other opaque geometry. As a rendering process proceeds, the renderer must often recompute the color value of a given screen pixel multiple times, because there may be many surfaces that intersect the volume subtended by the pixel. The average number of times a pixel needs to be rendered, for a particular scene, is called the depth complexity of the scene. Simple scenes have a depth complexity near unity, while complex scenes can have a depth complexity of ten or twenty. As scene models become more and more complicated, renderers will be required to process scenes of ever increasing depth complexity. Many techniques have been developed to perform visible surface determination, and a survey of these techniques is incorporated herein by reference: “Computer Graphics: Principles and Practice”, by Foley, van Dam, Feiner, and Hughes, Chapter 15: Visible-Surface Determination, pages 649 to 720, 2nd edition published by Addison-Wesley Publishing Company, Reading, Mass., 1990, reprinted with corrections 1991, ISBN 0-201-12110-7 (hereinafter referred to as the Foley Reference).
When a point on a surface (frequently a polygon vertex) is translated to screen coordinates, the point has three coordinates: 1) the x-coordinate of the affected pixel; 2) the y-coordinate of the affected pixel; and 3) the z-coordinate of the point in either eye coordinates, distance from the virtual screen, or some other coordinate system which preserves the relative distance of surfaces from the viewing point. In this document, positive z-coordinate values are used for the “look direction” from the viewing point, and smaller values indicate a position closer to the viewing point. For example, if a surface is approximated by a set of planar polygons, the vertices of each polygon are translated to screen coordinates. For points in or on the polygon (other than the vertices), the screen coordinates are interpolated from the coordinates of vertices, typically by the processes of edge walking and span interpolation, as discussed in the Deering Reference. Thus, a z-coordinate value is included in each pixel value (along with the color value) as geometry is rendered. The most common method for visible surface determination, or conversely, for hidden surface removal, is the Z-buffer. Another common hidden surface removal technique is called backface culling (see Foley Reference, page 663), which eliminates polygons from rendering before they are converted into pixels. Backface culling is generally included in the face determination step of the rendering pipeline.

Z-buffer

Stated simply, the Z-buffer stores, for every pixel, the z-coordinate of the pixel within the closest geometry (to the viewing point) that affects the pixel. Hence, as new pixel values are generated, each new pixel's z-coordinate is compared to the corresponding location in the Z-buffer. If the new pixel's z-coordinate is smaller (i.e., closer to the viewing point), this value is stored into the Z-buffer and the new pixel's color value is written into the frame buffer.
If the new pixel's z-coordinate is larger (i.e., farther from the viewing point), the frame buffer and Z-buffer values are unchanged and the new pixel is discarded. Method pseudocode for the Z-buffer method is shown in Appendix 1, which is a slightly modified version of FIG. 15.21 in the Foley Reference. A flow diagram of the prior art Z-buffer method is shown in the figures. One drawback to the Z-buffer hidden surface removal method is the requirement for geometry to be converted to pixel values before hidden surface removal can be done. This is because the keep/discard decision is made on a pixel-by-pixel basis, rather than at a higher level, such as at the level of the geometry in screen coordinates, which is accomplished by the present invention. Prior art Z-buffers are based on conventional Random Access Memory (RAM) or Video RAM (VRAM). High performance prior art Z-buffers employ many different techniques, such as page-mode addressing and bank interleaving, to interrogate as many Z-buffer memory locations per second as possible. The interrogation process is needed to perform the keep/discard decision on a pixel-by-pixel basis as geometry is rasterized. One major drawback to the prior art Z-buffer is its inherently pixel-sequential nature. For scenes with high depth complexity, access to the Z-buffer is a bottleneck which limits performance in renderers.

Temporal Correlation

Many applications of 3D computer graphics generate a sequence of scenes in a frame-by-frame manner. If the frame rate of the sequence is sufficiently high (this is generally the case), then the present scene looks very much like the previous scene, and the only differences are due to movement of objects or light sources within the scene or movement of the viewing point. Thus, consecutive scenes are similar to each other due to their temporal correlation.
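The Z-buffer keep/discard rule described above can be modeled minimally in Python (illustrative names; the document's own formulation is the pseudocode of Appendix 1):

```python
# Minimal model of the prior-art Z-buffer rule: keep a new pixel only
# when its z is smaller (closer to the viewing point) than the stored z.

def make_buffers(width, height, background=0):
    """Z-buffer initialized to the maximum distance ("blank" pixels)."""
    zbuf = [[float("inf")] * width for _ in range(height)]
    fbuf = [[background] * width for _ in range(height)]
    return zbuf, fbuf

def write_pixel(zbuf, fbuf, x, y, z, color):
    """Return True if the pixel was kept, False if discarded as hidden."""
    if z < zbuf[y][x]:
        zbuf[y][x] = z
        fbuf[y][x] = color
        return True
    return False
```

Every rasterized pixel requires one such comparison against memory, which is exactly the pixel-sequential bottleneck the parallel approach of this document is designed to avoid.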
Identifying the non-occulted geometry from the previous scene can help with the rendering of the present scene because such non-occulted geometry can be rendered first. Then, when geometry which was occulted in the previous scene undergoes hidden surface removal, most of it can be discarded before pixel color computations need to be done. Prior art rendering systems do not gain much from taking advantage of temporal correlation because they will only save computations at the very end of the graphics pipeline. On top of this, taking advantage of temporal correlation is difficult in prior art rendering systems because the “backward link” from the final values in the Z-buffer and frame buffer back to the geometry database is difficult to construct. In other words, prior art rendering systems smash geometry into separate and independent pixels, and taking advantage of temporal correlation requires knowing which pieces of geometry generated the pixels which survived the keep/discard decisions when an entire scene has completed the rendering process.

Geometry Databases

The geometry needed to generate a renderable scene is stored in a database. This geometry database can be a simple display list of graphics primitives or a hierarchically organized data structure. In the hierarchically organized geometry database, the root of the hierarchy is the entire database, and the first layer of subnodes in the data structure is generally all the objects in the “world” which can be seen from the viewpoint. Each object, in turn, contains subobjects, which, in turn, contain subsubobjects; thus resulting in a hierarchical “tree” of objects. Hereinafter, the term “object” shall refer to any node in the hierarchical tree of objects. Thus, each subobject is an object. The term “root object” shall refer to a node in the first layer of subnodes in the data structure. Hence, the hierarchical database for a scene starts with the scene root node, and the first layer of objects are root objects.
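The hierarchical tree of objects described above can be sketched as a simple recursive data structure. The class name and scene contents below are hypothetical:

```python
# Hypothetical sketch of a hierarchically organized geometry database:
# every node in the tree is an "object"; the first layer under the
# scene root holds the "root objects".

class GeomObject:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def walk(self):
        """Yield this object and every subobject, depth-first."""
        yield self
        for child in self.children:
            yield from child.walk()

scene = GeomObject("scene-root", [
    GeomObject("car", [GeomObject("wheel"), GeomObject("body")]),
    GeomObject("terrain"),
])
root_objects = [obj.name for obj in scene.children]
```

A hierarchical test can prune whole subtrees: if an object is found to be occulted, none of its subobjects need to be visited.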
Hierarchical databases of this type are used by the Programmer's Hierarchical Interactive Graphics System (PHIGS) and PHIGS PLUS standards. An explanation of these standards can be found in the book, “A Practical Introduction to PHIGS and PHIGS PLUS”, by T. L. J. Howard, et al., published by Addison-Wesley Publishing Company, 1991, ISBN 0-201-41641-7 (incorporated herein by reference and hereinafter called the Howard Reference). The Howard Reference describes the hierarchical nature of 3D models and their data structure on pages 5 through 8.

Content Addressable Memories

Most Content Addressable Memories (CAM) perform a bit-for-bit equality test between an input vector and each of the data words stored in the CAM. This type of CAM frequently provides masking of bit positions in order to eliminate the corresponding bit in all words from affecting the equality test. It is inefficient to perform magnitude comparisons in an equality-testing CAM because a large number of clock cycles is required to do the task. CAMs are presently used in translation look-aside buffers within the virtual memory systems of some computers. CAMs are also used to match addresses in high speed computer networks. CAMs are not used in any practical prior art renderers. Magnitude Comparison CAM (MCCAM) is defined here as any CAM where the stored data are treated as numbers, and arithmetic magnitude comparisons (i.e. less-than, greater-than, less-than-or-equal-to, etc.) are performed in parallel. This is in contrast to ordinary CAM which treats stored data strictly as bit vectors, not as numbers. An MCCAM patent, included herein by reference, is U.S. Pat. No. 4,996,666, by Jerome F. Duluk Jr., entitled “Content-Addressable Memory System Capable of Fully Parallel Magnitude Comparisons”, granted Feb. 26, 1991 (hereinafter referred to as the Duluk Patent). Structures within the Duluk Patent specifically referenced shall include the prefix “Duluk Patent”, e.g. “Duluk Patent MCCAM Bit Circuit”.
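The parallel magnitude comparison that distinguishes an MCCAM from an equality-testing CAM can be modeled as follows. This is a software simulation only; in hardware, every word compares itself against the input in a single operation:

```python
# Software model of an MCCAM query: every stored word compares its
# numeric contents against the input value. The comprehension below
# simulates what the hardware does simultaneously across all words.
import operator

def mccam_query(words, value, compare=operator.lt):
    """Return one match flag per stored word, e.g. stored < input."""
    return [compare(w, value) for w in words]
```

For example, querying the stored words [3, 7, 5] for less-than 6 flags the first and third words in one query; an equality-testing CAM would need many cycles of bit-masked equality tests to answer the same question.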
MCCAMs are not used in any prior art renderer. The basic internal structure of an MCCAM is a set of memory bits organized into words, where each word can perform one or more arithmetic magnitude comparisons between the stored data and input data. The method and apparatus of this document enhance the performance of the prior art Pixel Drawing Pipeline. The method and apparatus presented here perform the keep/discard decision on screen coordinate geometry before it is converted into individual pixels. This is done by utilizing parallel searching within a new type of Z-buffer based on a new type of Magnitude Comparison Content Addressable Memory (MCCAM), hereinafter called an MCCAM Z-buffer. This document includes new methods for the keep/discard decision, called Occulting Tests.

Occulting Tests

The methods disclosed here include four types of Occulting Tests: the Bounding Box Occulting Test, the Vertex Bounding Box Test, the Span Occulting Test, and the Raster Write Span Occulting Test. Also described herein is the Dual Occulting Test. The invention takes advantage of temporal correlation in a sequence of scenes through the use of Tags. Tags are “backward links” (i.e., pointers) to the source geometry, and the main problems with such “backward links” are overcome by storing them into a CAM within the Tag MCCAM Z-buffer, one Tag for every pixel (or Cell) in the display screen. Tags, implemented in an equality-testing CAM circuit, can be added to renderers that use only the prior art Z-buffer for hidden surface removal.
However, Tags are substantially more useful when used with an MCCAM Z-buffer. Three specific types of MCCAM Z-buffers are described: the Basic MCCAM Z-buffer, the Vertex MCCAM Z-buffer, and the Tag MCCAM Z-buffer. Five specific types of words within MCCAM Z-buffers are described: the Basic MCCAM Word, the Raster Write MCCAM Word, the Vertex MCCAM Word, the Hit Flag MCCAM Word, and the Tag MCCAM Word. Also included in this document are descriptions of Pixel Drawing Pipelines and of hardware that implements a Pixel Drawing Pipeline, including novel VLSI circuits. The methods presented in this document are described using both flow diagrams and pseudocode. Method flow diagrams are used for most aspects of the method, and method pseudocode is attached as a set of appendices and is used to describe the details of the method. Pseudocode is used in this document to describe a process performed by some type of apparatus, and all pseudocode contained herein is consistent in style with that of the Foley Reference. Variables and named constants are shown in italics; language reserved words (e.g. if, then, and, or, do, etc.) are shown in bold; and comments are shown in curly brackets. Each method pseudocode in the appendices has its lines numbered, where the numbering in each appendix's first line starts with one thousand times the appendix number along with the prefix “A”. The method pseudocode appendices include, for readability, vertical lines to show how both begin-end and repeat-until statements are paired together. Hierarchical method names (e.g.
“WritePixel” at A) are used. Reference numbers are four or five digit numbers, and the first one or two digits are the figure number where the reference item is best illustrated.

Modified Rendering Pipeline

FIG. 1 shows the Generic 3D Graphics Pipeline; the BBOT 3D Graphics Pipeline and hardware for implementing the Occulting Test within a Pixel Drawing Pipeline are also described. To implement the various versions of pixel drawing pipelines, six hardware pixel drawing subsystem architectures will be presented, including: 1) MCCAM Pixel Drawing Subsystem.

Bounding Box Occulting Test

In simplified terms, the Bounding Box Occulting Test uses an axially aligned three-dimensional bounding box as an approximation for the three-dimensional surface of a piece of geometry. This approximation, for occulting purposes, introduces an error in the keep/discard decision at the Occulting Test. FIG. 7 shows a piece of geometry
The geometry's projected bounding box
Thus, a projected bounding box is a minimum sized rectangle in the plane of the display screen Conceptually, both the Frame Buffer
The boolean value, PixelHit, is generated for each pixel. Pixels with PixelHit true are called “Hits”, and this condition indicates the new geometry may not be occulted at such pixels. In equations and method flow diagrams, the symbol “←” is used to indicate assignment of a value to a variable (needed to distinguish it from “=”, which is used here as an arithmetic test for equality). In all method flow diagrams, rhombuses are used for conditionals only, and do not assign values as rectangles do. The five values x If PixelHit is true for any pixel (i.e. all five inequalities are true), then the piece of geometry has failed the Bounding Box Occulting Test If PixelHit is not true for any pixel (PixelHit is false for all pixels), then the new piece of geometry passes the Bounding Box Occulting Test Thus, the Bounding Box Occulting Test The logical inverse of Equation 3 is:
which, if PixelMiss FIG. 8 is a rendering method flow diagram which includes the BBOT Pixel Drawing Pipeline method FIG. 9 is a method flow diagram for the Bounding Box Occulting Test FIG. 10 shows an example display screen Parallel Computation Apparatus Rather than perform either Equation 3 or Equation 4 sequentially pixel-by-pixel, it is desirable to perform the computation for all pixels in the display screen Parallel searching of z-values provides a major advantage over prior art Z-buffer techniques by drastically reducing the time it takes to determine if a piece of geometry is occulted. Entire complex objects can be tested to see if they are occulted before they undergo the computationally expensive process of conversion into separate pixels. If it is determined that an entire object is occulted, then all the computations normally required to convert the object into pixels can be completely avoided. FIG. 11 shows the data organization of the Basic MCCAM Z-buffer The prior art MCCAM in the Duluk Patent would not be appropriate for use as an MCCAM Z-buffer As shown in FIG. 11, Pixels are stored in row-by-row order in the MCCAM Z-buffer When a prior art Z-buffer is initialized before a scene starts to be rendered, all Z-buffer values must be set to the maximum in order to designate each pixel “blank” This means geometry at any distance will overwrite the Frame Buffer As an optional feature of the MCCAM Z-buffer
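As a concrete, though serial, model of Equations 3 and 4, the per-pixel test can be sketched in Python. The function and buffer names here are illustrative, not from the patent, and the MCCAM evaluates all of these comparisons simultaneously where this loop visits pixels one at a time:

```python
def bounding_box_occulting_test(zbuf, x_min, x_max, y_min, y_max, z_min_new):
    """Return True if the geometry passes the test (is occulted everywhere).

    A "Hit" occurs at a pixel inside the projected bounding box whose stored
    z-value is farther than the geometry's minimum z; any Hit means the
    geometry may be visible and must be kept for rendering.
    """
    for y in range(y_min, y_max + 1):
        for x in range(x_min, x_max + 1):
            if z_min_new < zbuf[y][x]:   # PixelHit: geometry may be visible here
                return False             # fails the Occulting Test
    return True                          # PixelMiss everywhere: discard geometry
```

A hardware MCCAM Z-buffer evaluates the five inequalities for every word in a single clock cycle; the nested loop above is only a functional model of the result.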
FIG. 12 shows a hardware implementation of a Basic MCCAM Word In the X-field Because, for a fixed display screen When a new z-value needs to be written into the Basic MCCAM Z-buffer The Basic MCCAM Word In a later section, information conveying which pixels were Hits is used as an aid in the rasterization process. For systems with this optional capability, this information is read from the MCCAM Z-buffer As an alternate approach to supplying the five parameters at once, the parameters can be supplied as two triplets:
Hence, each triplet of Equation 6 can be supplied and evaluated in a single clock cycle, thus requiring two clock cycles per bounding box. For the simple bounding box approach, this is inefficient since z FIG. 6 is a block diagram of a portion of a hardware rendering subsystem If the geometry is not occulted, the Rasterize Processor Raster Write Capability In the rendering subsystem of FIG. 6, the MCCAM Z-buffer The projection of a piece of geometry is generally not a rectangle. Therefore, when simultaneously writing a z-value to multiple words, a degenerate rectangle with a height of one pixel is used. This degenerate rectangle is one raster of pixels within the projection of the piece of geometry. Hence the name “Raster Write” is used. The degenerate rectangle is a span of the projected geometry, and is generated by edge walking In Raster Write MCCAM Words
When multiple simultaneous writes are desired, the following inputs are used: 1) x The Raster Write Control Raster Write capability is preferred over word-by-word writing because it increases performance. Hence, additional novel word types (explained in later sections) assume a Raster Write capability although they could be built with word-by-word writing. A rendering subsystem Another reason for having both an MCCAM Z-buffer Prior art rendering systems do not have the capability to simultaneously write the same value into multiple Z-buffer memory locations. This is because prior art systems do not include a separate memory to store approximate z-values. The apparatus described here stores approximate z-values into a memory, the MCCAM Z-buffer The circuits in the prior art MCCAM in the Duluk Patent cannot write to multiple words, and thus cannot perform the Raster Write operation. Scene Rendering and Hierarchical Object Occulting As described above, geometry databases are frequently hierarchically organized into a tree of objects. A major performance advantage arises from applying Occulting Tests In order to aid in generating a bounding box for an object, the database should include some bounding information. However, since the database generally contains geometry which is not yet translated to screen coordinates, or even world coordinates, any bounding information must pass through the spatial transformation process, which is from data input to screen space conversion As shown in FIG. 15, a scene is rendered by first initializing For each root object, a bounding box is made The method of FIG. 15 is naturally recursive due to the treelike nature of the database. However, recursion is not well illustrated by a flow diagram, so description of a recursive version is presented using method pseudocode in the appendices. Prior art systems cannot process entire objects to see if they are occulted.
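The Raster Write operation described above writes one approximate z-value into every word whose pixel lies on a single raster between two x-coordinates. A minimal functional model follows; the data layout and names are assumptions for illustration, not the patent's circuit:

```python
def raster_write(words, y, x_left, x_right, z_new):
    """Model a Raster Write: all MCCAM Words whose (x, y) location falls on
    raster y within [x_left, x_right] take z_new at once.  In hardware every
    matching word updates in parallel; this loop merely models the effect."""
    for w in words:
        if w["y"] == y and x_left <= w["x"] <= x_right:
            w["z"] = z_new

# One word per pixel of a two-raster, four-pixel-wide screen, initialized "blank".
words = [{"x": x, "y": y, "z": float("inf")} for y in range(2) for x in range(4)]
raster_write(words, 0, 1, 2, 7.5)   # one span of a projected polygon
```

After the call, only the words at (1, 0) and (2, 0) hold 7.5; all other words remain at their blank (maximum) value.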
Prior art renderers must break objects into separate pixels, then test each pixel to see if it is occulted. The invention described here processes an entire object by performing operations on all pixels at once. The elimination of entire occulted objects saves a large fraction of the total computation, thereby allowing more complicated scenes to be rendered because more objects can be processed. Processing Vertex-based Geometry and Polygon Meshes FIG. 16 shows a polygon
A common technique for generation of a polygonally approximated surface is the use of polygon meshes, as described in the Deering Reference, the Watt Reference, and the Howard Reference. FIG. 17 shows a triangle mesh which uses six vertices to describe four triangles. The four projected bounding boxes An alternative to independently supplying the parameters of the bounding boxes (four in the case of FIG. 17) is to operate directly on vertex coordinates, and the method to do so is hereinafter called the Vertex Bounding Box Occulting Test Equation 9 operates on three vertices within a triangle mesh: V
which has the same logical value. This expansion can be done because if x<x The simplest version of the Vertex MCCAM Z-buffer
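The idea behind Equation 9, forming the occulting query directly from a triangle's three vertices rather than from a separately supplied bounding box, can be sketched as follows (helper names are illustrative):

```python
def triangle_query(v0, v1, v2):
    """From three screen-space vertices (x, y, z), derive the five parameters
    of the projected bounding box query: x and y extents plus the minimum z."""
    xs, ys, zs = zip(v0, v1, v2)      # regroup coordinates by axis
    return min(xs), max(xs), min(ys), max(ys), min(zs)

x_min, x_max, y_min, y_max, z_min = triangle_query((2, 5, 0.8), (7, 1, 0.6), (4, 9, 0.9))
```

The extents and minimum z computed here are exactly the five values a Bounding Box Occulting Test needs, so no bounding box has to be supplied separately for each triangle.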
The five one-bit results, L The rth Comparison Register Continuing with this example, another vertex could be input to form a triangle with the vertices whose Vertex Comparison Results For hardware to generate many different logic functions (such as Equation 12), some mechanism is needed for selecting which of the Comparison Registers For the Vertex Bounding Box Occulting Test Vertex Comparison Result If high order polygons (ones with more vertices) are allowed, where the number of vertices exceeds the number of Comparison Registers
New values in Comparison Register If the running conjunction capability of Equation 14 is included, then the minimum number of Comparison Registers A sequence of vertices composing multiple triangle meshes is illustrated in FIG. 19 (this figure is an adapted version of FIG. 3 of the Deering Reference). The first column of the table within FIG. 19 is the list of vertices The Vertex MCCAM Word Step 1: Getting as input, on the first cycle of the Raster Write operation, x Step 2: Getting as input, on the second clock cycle of the Raster Write operation, x Step 3: On the second clock cycle of the Raster Write operation, asserting both WrEn For a Raster Write operation, the inputs are the same except y
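For a triangle mesh, each new vertex after the first two closes a triangle with the two preceding vertices, so per-triangle bounding boxes can be formed incrementally; this is the effect the Comparison Registers achieve by reusing stored per-vertex comparison results. A serial sketch under that assumption, with illustrative names:

```python
def mesh_bboxes(vertices):
    """For a triangle-strip style mesh, yield one projected-bounding-box
    query per triangle: (x_min, x_max, y_min, y_max, z_min).  Each new
    vertex is combined with the previous two, mirroring how the hardware
    reuses stored per-vertex results instead of recomputing them."""
    queries = []
    for i in range(2, len(vertices)):
        tri = vertices[i - 2 : i + 1]          # the two prior vertices plus the new one
        xs, ys, zs = zip(*tri)
        queries.append((min(xs), max(xs), min(ys), max(ys), min(zs)))
    return queries
```

A stream of n vertices thus produces n - 2 occulting queries, one per triangle, at a cost of one small update per vertex.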
The VBBOT Pixel Drawing Pipeline method FIG. 21 is the method flow diagram for the Vertex Comparison FIG. 22 is the method flow diagram for the Vertex Bounding Box Occulting Test Prior art renderers generally process geometry databases composed of polygon meshes. It is very important for this invention to be compatible with existing geometry databases, so processing of polygon meshes must be included. In prior art renderers, as a polygon mesh is processed, every new vertex can cause an entire polygon to be generated (this is true for triangle meshes), thereby requiring many pixel values to be generated. This very-many-to-one mapping of pixels to polygons puts the bottleneck of prior art renderers in the pixel-generating portion of the system. In this invention, as a vertex generates a new polygon, the polygon is immediately tested in a single clock cycle to see if it is occulted, thereby eliminating a large fraction of the pixels within these polygons. This invention reduces the very-many-to-one mapping, which reduces (or eliminates) the bottleneck, and increases performance. The prior art MCCAM in the Duluk Patent cannot perform the Vertex Bounding Box Occulting Test Methods for Reading Hits The method disclosed up to this point performs an Occulting Test As an optional feature, Hit information can be read from the MCCAM Z-buffer For discussion purposes, only VertexPixelMiss Individual Pixel Hits When a piece of renderable geometry is processed by the MCCAM Z-buffer If the Rasterize Processor Step 1: The stored Hit Flags, labeled Flg Value Step 2: The Priority Resolver and Encoder Step 3: In this highest priority Hit Flag MCCAM Word Step 4: Also in the same Hit Flag MCCAM Word Step 5: If a lower priority Hit Flag MCCAM Word Flg Value In step 3, clearing one Hit Flag in the Hit Flag Register If the Z-field FIG. 24 shows a Hit Flag MCCAM Word Multiple Hit Flag Registers Segment Hits The main drawback to reading every Pixel Hit out of the MCCAM Z-buffer In FIG. 
25, a display screen The first example Segment Hit The second example Segment Hit The Raster Write capability described above is a mechanism for simultaneously changing a multiplicity of z-values stored in the MCCAM Z-buffer
where x In the Hit Flag MCCAM Word If the Start Pixel Hit and the End Pixel Hit are sequentially read, then the steps to read the Start Pixel Hit are: Step 1: The stored Hit Flags, labeled Flg Value Step 2: The Priority Resolver and Encoder Step 3: In this highest Hit Flag MCCAM Word If the Start Pixel Hit and the End Pixel Hit are sequentially read, then the steps to read the End Pixel Hit are: Step 1: The stored Hit Flags, labeled FlgValue Step 2: The Priority Resolver and Encoder Step 3: In all the Hit Flag MCCAM Words Step 4: Also in the same highest priority Hit Flag MCCAM Word Step 5: If a lower priority Hit Flag MCCAM Word Segment Hit Read capability would generally be used in conjunction with Raster Write capability; therefore, the Z-field An alternative definition of Segment Hit is to allow only one Segment Hit per display screen Raster Hits In FIG. 25, a display screen The example Raster Hit Raster Hits have a major advantage over Segment Hits and Pixel Hits in that a Raster Hit has only one parameter which needs to be output from the MCCAM Z-buffer Another advantage stemming from having the y-coordinate as the only parameter is simplification of the Priority Resolver and Encoder The steps for reading a Raster Hit are: Step 1: The stored Hit Flags, labeled FlgValue Step 2: The Priority Resolver and Encoder Step 3: The Priority Resolver and Encoder Step 4: In this highest priority Raster Set Step 5: The Priority Resolver and Encoder Step 6: If a lower priority Raster Set Reading Raster Hits this way does not require any of the coordinate fields, X-field FIG. 24 is a Hit Flag MCCAM Word The prior art MCCAM in the Duluk Patent includes a priority resolver for finding the first hit. However, it is not capable of identifying sets of hits, and therefore cannot find Segment Hits or Raster Hits. 
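A Segment Hit reports a run of consecutive Hits on one raster by its Start Pixel Hit and End Pixel Hit, rather than one pixel at a time. The grouping that the Priority Resolver and Encoder performs can be modeled as follows; representing a raster's Hit Flags as a list of booleans is an assumption for illustration:

```python
def segment_hits(raster_flags):
    """Group consecutive set Hit Flags along one raster into Segment Hits,
    each reported as a (start_x, end_x) pair."""
    segments, start = [], None
    for x, hit in enumerate(raster_flags):
        if hit and start is None:
            start = x                        # Start Pixel Hit
        elif not hit and start is not None:
            segments.append((start, x - 1))  # End Pixel Hit
            start = None
    if start is not None:                    # segment runs to the raster's end
        segments.append((start, len(raster_flags) - 1))
    return segments
```

Reading two coordinates per segment rather than one per pixel is what makes Segment Hits cheaper to transfer than individual Pixel Hits.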
The Span Occulting Test For a piece of geometry to be discarded before it is rasterized, both the Bounding Box Occulting Test An alternative to the use of a bounding box is the use of spans, as described here for the Span Occulting Test
Only one y-parameter is needed for a span because the two pixels The Span Occulting Test
or, the inverse, PixelMiss
If PixelMiss If PixelMiss FIG. 27 is a method flow diagram for the Span Occulting Test The method of FIG. 27 utilizes either the Basic MCCAM Word When the Span Occulting Test When used as the only occulting test, the Span Occulting Test A version of the Span Occulting Test The Raster Write Span Occulting Test The SOT Pixel Drawing Pipeline method When the Span Occulting Test FIG. 30 is the method flow diagram for the Raster Write Span Occulting Test The MCCAM Z-buffer Step 1: The input on the first clock cycle of a Raster Write Span Occulting Test Step 2: The input on the second clock cycle of a Raster Write Span Occulting Test Step 3: Since the span has used two Comparison Registers Step 4: During the second clock cycle, the signals WrEn The logic in Raster Write Control If only spans are tested to see if they are occulted (and not polygons or meshes), the minimum number of Comparison Registers The Dual Occulting Test The main drawback with the Span Occulting Test FIG. 32 shows the method flow diagram for the Dual Occulting Test Pixel Drawing Pipeline method The above step which gets a span The minimum number of Comparison Registers Prior art renderers perform pixel-by-pixel keep/discard decisions, requiring geometry to be decomposed all the way down to individual pixels before determining if they can be thrown away. This invention introduces three additional earlier levels where the keep/discard decisions are performed. The first is on the object (or subobject) level, the second is at the polygon (or other renderable geometry) level, and the third is at the span level. This hierarchy of tests provides successively stricter and stricter filtration of geometry in the scene in order to avoid generation of individual pixels whenever possible. At each level in the hierarchy of tests, fewer and fewer pieces of geometry are left, while the ones which remain cost more and more to process. 
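The Span Occulting Test discussed above can be modeled serially. Only one y-parameter is needed since a span occupies a single raster; the function and buffer names are illustrative:

```python
def span_occulting_test(zbuf, y, x_left, x_right, z_min_span):
    """Return True if the span (one raster of the projected geometry,
    from x_left to x_right on raster y) is occulted at every pixel."""
    for x in range(x_left, x_right + 1):
        if z_min_span < zbuf[y][x]:   # PixelHit: span may be visible here
            return False
    return True                       # PixelMiss for every pixel in the span
```

Because a span is far tighter than a whole bounding box, this test discards geometry that the Bounding Box Occulting Test alone would have to keep.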
Thus, this invention makes efficient use of available computational ability at each level, while reducing computations otherwise wasted on items which are thrown away later. The Span FIFO The DOT Pixel Drawing Pipeline method The Span FIFO The use of the Span FIFO In prior art rendering systems, all polygon spans generated by edge walking FIG. 33 shows a portion of a 3D rendering system, including the Span FIFO Pixel Drawing Subsystem Separation of the Edge Walk Processor Organizing Pixels into Blocks The Pixel Drawing Subsystems FIG. 35 shows a display screen
where S A vertex has a location relative to every Block, and is located outside every Block except for one. To be outside of a Block, at least one of the vertex's coordinates relative to the Block must be negative or greater than or equal to the number of pixels in the Block along the corresponding dimension. The Frame Buffer The simple Block partitioning shown in FIG. 35 is inefficient because a polygon will have a tendency to fall primarily within a small set of Blocks, and therefore, not evenly spread its rendering load over the set of Pixel Drawing Subsystems FIG. 36 shows an example of an Interleaved Block, where each Interleaved Block includes a portion of every 8th raster line. The translation of a vertex at screen coordinates (V
where S FIG. 37 shows a multiplicity of Block Pixel Drawing Subsystems FIG. 38 shows a multiplicity of Block Span FIFO Pixel Drawing Subsystems For the partial rendering systems of FIG. Temporal Correlation and the use of Tags Most applications of 3D computer graphics generate a sequence of scenes in a frame-by-frame manner. Applications can generate either an animation (done in non-real time) or a simulation (done in real time). If the frame rate of the sequence is sufficiently high (this is generally the case), then the present scene looks very much like the previous scene, and the only differences are due to movement of objects or light sources within the scene or movement of the viewing point. Thus, consecutive scenes are similar to each other due to their temporal correlation. For the hidden surface removal problem, identifying the non-occulted geometry from the previous frame can help with the rendering of the present scene because such non-occulted geometry can be rendered first. Then, when geometry which was occulted in the previous scene undergoes the Occulting Tests Taking advantage of temporal correlation is difficult in prior art rendering systems because the “backward link” from the final values in the Z-buffer (and frame buffer) back to the geometry database is difficult to construct. In other words, prior art rendering systems smash geometry into separate and independent pixels, and taking advantage of temporal correlation requires knowing which pieces of geometry generated the pixels which survived the keep/discard decisions when an entire scene has completed the rendering process. A simple solution to generating the “backward link” is to save, for every pixel in the display screen When Span Interpolation When a scene is completely rendered, the Tags are read, thereby identifying the geometry which was not occulted. Then, the geometry is rendered first in the next frame. 
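The render-order strategy described above, rendering the geometry that was visible in the previous frame first and subjecting everything else to an Occulting Test, can be sketched as follows. Here `render` and `is_occulted` are hypothetical stand-ins for the rasterization path and the MCCAM query:

```python
def render_with_temporal_correlation(objects, visible_tags, render, is_occulted):
    """Render last frame's visible objects first, priming the Z-buffer with
    the scene's likely front surfaces; most remaining geometry then fails
    its Occulting Test before any pixels are generated for it."""
    for obj in objects:                       # pass 1: previously visible geometry
        if obj["tag"] in visible_tags:
            render(obj)
    for obj in objects:                       # pass 2: everything else, tested first
        if obj["tag"] not in visible_tags and not is_occulted(obj):
            render(obj)
```

With high temporal correlation between consecutive frames, nearly all second-pass objects are discarded by the test, so the pixel-generation cost approaches that of the visible surfaces alone.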
However, for a display screen The tag reading and redundancy problems are solved by the Tag MCCAM Z-buffer Just before rendering of a scene is started, all the Tags stored in the Tag MCCAM Z-buffer As rendering proceeds, Tags are written into Tag-fields When the rendering of a scene is complete, and Tags of the visible geometry are stored in the set of Tag-fields Step 1: A Hit Flag Register Step 2: The Hit Flag Register Step 3: The main system processor (or the one which controls reading the geometry database) adds Step 4: In this highest priority Tag MCCAM Word Step 5: The Tag being read is on the BusTag Step 6: If, after matching Tags cause their Hit Flags to be cleared, a lower priority Tag MCCAM Word Step 7: If FIG. 42 shows a method flow diagram for rendering a sequence of scenes. Here If the scene has a significant chance of having high temporal correlation with the previous scene, then: 1) the Visible Object List is created If the scene does not have a significant chance of having high temporal correlation with the previous scene, the Visible Object List is not created processing proceeds as: 1)the Frame Buffer An alternate method of generating a new Visible Object List starts with the previous Visible Object List. An item from the previous Visible Object List is used as a search key for all the Tag-fields The above discussion assumes Tags are used in conjunction with an MCCAM Z-buffer In addition to helping with hidden surface removal, Tags can also be used to postpone computationally expensive lighting calculations until after all hidden surface removal is done. Once the entire geometry database has been traversed, and each pixel has a Tag which identifies the piece of geometry responsible for the pixel's final color, the lighting calculations can be done. For a sequence of scenes with high temporal correlation, Tags will accurately predict which geometry should be rendered first. 
Subsequent geometry will be mostly occulted, and therefore, will be filtered out before pixels need to be generated. As depth complexity of scenes increases, the total number of pixels needed to be generated will not increase dramatically. In fact, ideally, the amount of pixel generation is proportional to the number of pixels in the frame buffer and independent of the depth complexity of the scene or size of the geometry database. While this ideal situation may not actually be reached, the invention disclosed here is a dramatic improvement over prior art rendering systems. In prior art rendering systems, the amount of pixel generation is proportional to the depth complexity of the scene and size of the geometry database. Hence, prior art rendering systems require substantially more computation per scene than the system presented here. Organizing Pixels into Cells, or Blocks of Cells As described previously herein, one or more MCCAM Z-buffers FIG. 43 shows an example display screen Raster Cells are Cells which have pixels within only one raster line. FIG. 44 shows an example display screen FIG. 45 shows one Interleaved Block of Raster Cells, where the example display screen Using Cells reduces the number of required MCCAM Words When a piece of geometry is undergoing an Occulting Test When the piece of geometry fails the Occulting Test When the Span FIFO In general, the use of Cells requires a decimation of the two-dimensional display screen Using Tags and Cells When Cells are used, only one MCCAM Word As a Cell z-value is written into an MCCAM Word Rather than storing Tags in the MCCAM Z-buffer The Preferred Embodiment Many choices for the method and apparatus have been described in this document. While it is not possible to describe the best choices for all systems, the preferred embodiment described in this section is selected to be reasonably compatible with portions of existing high-performance, state-of-the-art real-time rendering systems. 
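With Cells, one MCCAM Word covers a rectangle of pixels instead of a single pixel, so screen coordinates must be decimated to cell coordinates and each cell must hold a z-value that is valid for the whole cell. One conservative choice, assumed here purely for illustration and not necessarily the patent's exact scheme, is to store a cell's farthest (maximum) z, so the Occulting Test can never wrongly discard visible geometry:

```python
CELL_W, CELL_H = 4, 1   # e.g. Raster Cells: four pixels wide, one raster tall

def cell_of(x, y):
    """Decimate a pixel coordinate to its cell coordinate."""
    return x // CELL_W, y // CELL_H

def conservative_cell_z(pixel_zs):
    """A cell's stored z must not be nearer than any pixel it covers;
    storing the maximum guarantees the test errs toward keeping geometry."""
    return max(pixel_zs)
```

The trade-off is the one the text describes: fewer MCCAM Words are needed, but the coarser z-values make the Occulting Test slightly less selective.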
Most rendering systems take polygons as their primary input. This makes the Vertex Bounding Box Occulting Test The Raster Write Span Occulting Test Blocks are included in the preferred embodiment because they provide a divide-and-conquer approach to the problem. Interleaved Blocks are best because they provide a mechanism for computation associated with a piece of geometry to be evenly spread over Block Span FIFO Pixel Drawing Subsystems Reading Hits is not included in the preferred embodiment because it (in general) reduces the throughput of the MCCAM Z-buffer The Span FIFO Cells are not included in the preferred embodiment because they require data to be read from a conventional Z-buffer Tags alter the way the hierarchical data structure is traversed, so Tags do not easily fit into existing rendering systems. Hence, when the method and apparatus of this document are incorporated into an existing rendering system, the preferred embodiment does not include the use of Tags, and therefore uses the Vertex MCCAM Z-buffer In summary, the preferred embodiment consists of: 1) the Dual Occulting Test 3D Graphics Pipeline If Tags are included, then items 5 and 6 in the preceding paragraph are replaced by the following: 5) the Tag MCCAM Z-buffer Another good choice for an embodiment would be to eliminate any approximations for the values stored in the MCCAM Z-buffer Hardware Implementation The MCCAM Z-buffer The Duluk Patent includes the Duluk Patent MCCAM Bit Circuit The Duluk Patent MCCAM Bit Circuit FIG. 48 shows the MCCAM Bit Circuit A MCCAM Bit Circuit A FIG. 49 shows MCCAM Bit Circuit B FIG. 50 shows MCCAM Bit Circuit C FIG. 51 shows the Infinity Flag Bit Circuit FIG. 52 shows a Z-field A Z-field Since the values in all the X-fields In contrast, the Duluk Patent provides writing of all fields within a word. This introduces a problem in assigning particular pixel locations to specific words. If writable X-fields As an example, FIG. 
54 shows a circuit Since MCCAM Words Another economy can be achieved by realizing X-fields FIG. 56 shows a prior art CAM Bit Circuit FIG. 57 shows the Tag Bit Circuit FIG. 58 shows a Tag Invalid Bit Circuit FIG. 59 shows a Tag-field Tag-fields Appendix 1: PriorArtBuffer This is pseudocode for the prior art Z-buffer method, slightly modified from that found in the Foley Reference. Appendix 2: Global const, type, and var definitions This pseudocode defines constants, data types, and global variables which are used in subsequent method pseudocode. Appendix 3: BoundingBoxZBuffer Method pseudocode “BoundingBoxZBuffer” compares the closest point (that is, minimum z-value) of a graphics primitive to all MCCAM Z-buffer Appendix 4: BoundingBoxZBufferInfinity Method pseudocode “BoundingBoxZBufferInfinity” adds an Infinity Flag Appendix 5: BoundingBoxZBufferRasterWrite In method pseudocode “BoundingBoxZBufferRasterWrite”, the Raster Write operation is used to store approximate z-values into the MCCAM Z-buffer Appendix 6: RenderSceneRecursive Method pseudocode “RenderSceneRecursive” renders a scene of hierarchically organized objects, where the depth of subobjects can be arbitrarily deep because of the recursive nature of the method pseudocode RenderObjectRecursive A Appendix 7: BoundingBoxZBufferVertex Method pseudocode “BoundingBoxZBufferVertex” operates on polygon meshes. 
For each pixel in the display screen Appendix 8: BoundingBoxZBufferVertexPixelHits Method pseudocode “BoundingBoxZBufferVertexPixelHits” uses the VBBOT Appendix 9: BoundingBoxZBufferVertexSegmentHits Method pseudocode “BoundingBoxZBufferVertexSegmentHits” uses the VBBOT Appendix 10: BoundingBoxZBufferVertexRasterHits Method pseudocode “BoundingBoxZBufferVertexRasterHits” uses the VBBOT Appendix 11: BoundingBoxZWithDualOccultingTest Method pseudocode “BoundingBoxZWithDualOccultingTest” operates on polygon meshes using the VBBOT Appendix 12: BoundingBoxZWithDualOccultingTestAndSpanFifo Method pseudocode “BoundingBoxZWithDualOccultingTestAndSpanFifo” processes polygon meshes and spans in the same manner as the method in Appendix 11 except with the addition of the Span FIFO Appendix 13: BoundingBoxZWithDualOccultingTestAndBlockSpanFifo Method pseudocode “BoundingBoxZWithDualOccultingTestAndBlockSpanFifo” processes polygon meshes and spans in the same manner as the method in Appendix 12 except the display screen Appendix 14: BlockVertexComparisons Method pseudocode “BlockVertexComparisons” generates comparison Results Appendix 15: RenderSequenceWithTags Method pseudocode “RenderSequenceWithTags” renders a sequence of scenes, each scene with multiple hierarchical objects. As objects are rendered, each object's Tag, or identification number, is stored in each pixel it affects. When subsequent scenes are to be rendered, if there is enough temporal coherency, the Tags of visible objects are read into a Visible Object List. The objects in this list are rendered first, resulting in a partially rendered scene where most of the visible surfaces have been rendered, and most of the yet-to-be-rendered objects are occulted. Before these yet-to-be-rendered objects are rendered, they are subjected to the VBBOT Appendix 16: BoundingBoxZBufferWithDualOccultingTestTags Method pseudocode “BoundingBoxZBufferWithDualOccultingTestTags” uses the VBBOT
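For contrast with the appendices above, the prior art Z-buffer method of Appendix 1 (after the Foley Reference) amounts to a per-pixel keep/discard loop. This Python rendering of that classic method is a sketch; the already-rasterized input representation is assumed for brevity, and smaller z means nearer, consistent with the rest of this document:

```python
def prior_art_zbuffer(primitives, width, height):
    """Classic Z-buffer: every primitive is fully rasterized, and the
    keep/discard decision is made pixel by pixel, never earlier."""
    blank = float("inf")                      # "blank" = maximum depth
    zbuf = [[blank] * width for _ in range(height)]
    frame = [[None] * width for _ in range(height)]
    for prim in primitives:
        for x, y, z, color in prim:           # pixels of one rasterized primitive
            if z < zbuf[y][x]:                # nearer than what is stored: keep
                zbuf[y][x] = z
                frame[y][x] = color
    return frame
```

Every primitive pays full rasterization cost here regardless of occultation, which is precisely the expense the Occulting Tests of this document avoid.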