CROSS REFERENCES TO RELATED APPLICATIONS

[0001]
This application claims the benefit of U.S. Provisional application Ser. No. 60/250,823 filed on Dec. 1, 2000 titled “Multiple Processor Visibility Search System and Method”.

[0002]
This application is a continuation-in-part of U.S. patent application Ser. No. 09/247,466 filed on Feb. 9, 1999 titled “Visible-Object Determination For Interactive Visualization”, which claims the benefit of U.S. Provisional application Ser. No. 60/074,868 filed on Feb. 17, 1998 titled “Visible-Object Determination for Interactive Visualization”.
BACKGROUND OF THE INVENTION

[0003]
1. Field of the Invention

[0004]
The present invention relates generally to the field of computer graphics, and more particularly, to the problem of determining the set of objects (and portions of objects) visible from a defined viewpoint in a graphics environment.

[0005]
2. Description of the Related Art

[0006]
Visualization software has proven to be very useful in evaluating three-dimensional designs long before the physical realization of those designs. In addition, visualization software has shown its cost effectiveness by allowing engineering companies to find design problems early in the design cycle, thus saving them significant amounts of money. Unfortunately, the need to view more and more complex scenes has outpaced the ability of graphics hardware systems to display them at reasonable frame rates. As scene complexity grows, visualization software designers need to carefully use the rendering resources provided by graphics hardware pipelines.

[0007]
A hardware pipeline wastes rendering bandwidth when it discards rendered triangle work. Rendering bandwidth waste can be decreased by not asking the pipeline to draw triangles that it will discard. Various software methods for reducing pipeline waste have evolved over time. Each technique reduces waste at a different point within the pipeline. As an example, software culling of objects falling outside the view frustum can significantly reduce discards in a pipeline's clipping computation. Similarly, software culling of back-facing triangles can reduce discards in a pipeline's lighting computation.

[0008]
The z-buffer is the final part of the graphics pipeline that discards work. In essence, the z-buffer retains visible surfaces, and discards those not visible because they are behind another surface (i.e. occluded). As scene complexity increases, especially in walk-through and CAD environments, the number of occluded surfaces rises rapidly and as a result the number of surfaces that the z-buffer discards rises as well. A frame's average depth complexity determines roughly how much work (and thus rendering bandwidth) the z-buffer discards. In a frame with a per-pixel depth complexity of d the pipeline's effectiveness is 1/d. As depth complexity rises, the hardware pipeline thus becomes proportionally less and less effective.
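As a quick worked illustration of the 1/d relationship stated above, the following minimal sketch tabulates the fraction of rendered work that survives the depth test. The function name is hypothetical, and the model assumes a uniform depth complexity across all pixels.

```python
def pipeline_effectiveness(depth_complexity):
    """Fraction of rendered surface work that survives the z-buffer (1/d model)."""
    if depth_complexity < 1:
        raise ValueError("depth complexity must be at least 1")
    return 1.0 / depth_complexity


# With four surfaces covering each pixel on average, only a quarter of the
# rendered work ends up visible; the rest is discarded by the depth test.
for d in (1, 2, 4, 10):
    print(d, pipeline_effectiveness(d))
```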

[0009]
Software occlusion culling has been proposed as an additional tool for improving rendering effectiveness. A visualization program which performs occlusion culling effectively increases the overall rendering bandwidth of the graphics hardware by not asking the hardware pipeline to draw occluded objects. Computing a scene's visible objects is the complementary problem to that of occlusion culling. Rather than removing occluded objects from the set of objects in a scene or frustum-culled scene, a program instead computes which objects are visible and instructs the rendering hardware to draw just those. A simple visualization program can compute the set of visible objects and draw those objects from the current viewpoint, thus allowing the pipeline to focus on removing back-facing polygons and the z-buffer to remove any non-visible surfaces of those objects.

[0010]
One technique for computing the visible object set uses ray casting as shown in FIG. 1. RealEyes [Sowizral, H. A., Zikan, K., Esposito, C., Janin, A., Mizell, D., “RealEyes: A System for Visualizing Very Large Physical Structures”, SIGGRAPH '94, Visual Proceedings, 1994, p. 228], a system that implemented the ray casting technique, was demonstrated in SIGGRAPH 1994's BOOM room. At interactive rates, visitors could “walk” around the interior of a Boeing 747 or explore the structures comprising Space Station Freedom's lab module.

[0011]
The intuition for the use of rays in determining visibility relies on the properties of light. The first object encountered along a ray is visible since it alone can reflect light into the viewer's eye. Also, that object interposes itself between the viewer and all succeeding objects along the ray, making them not visible. In the discrete world of computer graphics, it is difficult to propagate a continuum of rays, so a discrete subset of rays is invariably used. Of course, this implies that visible objects or segments of objects smaller than the resolution of the ray sample may be missed. This is because rays guarantee correct determination of visible objects only up to the density of the ray sample. FIG. 1 illustrates the ray-based method of visible object detection. Rays that interact with one or more objects are marked with a dot at the point of their first contact with an object. It is this point of first contact that determines the value of the screen pixel corresponding to the ray. Also observe that the object 10 is small enough to be entirely missed by the given ray sample.
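The ray-sampling behavior described above can be sketched in two dimensions. This is only an illustrative toy, not the RealEyes implementation: objects are assumed to be circles, the viewpoint is the origin, and the names `ray_circle_distance`, `visible_objects`, and the scene contents are all hypothetical.

```python
import math


def ray_circle_distance(angle, center, radius):
    """Distance from the origin to the first hit on a circle along a ray, or None."""
    dx, dy = math.cos(angle), math.sin(angle)
    cx, cy = center
    t_mid = cx * dx + cy * dy                   # projection of the center onto the ray
    gap2 = cx * cx + cy * cy - t_mid * t_mid    # squared distance from center to ray line
    if gap2 > radius * radius:
        return None                             # the ray line misses the circle
    half = math.sqrt(radius * radius - gap2)
    for t in (t_mid - half, t_mid + half):      # nearest non-negative intersection
        if t >= 0:
            return t
    return None                                 # circle lies behind the viewpoint


def visible_objects(circles, angles):
    """Names of the objects that are first along at least one sampled ray."""
    visible = set()
    for angle in angles:
        hits = []
        for name, (center, radius) in circles.items():
            t = ray_circle_distance(angle, center, radius)
            if t is not None:
                hits.append((t, name))
        if hits:
            visible.add(min(hits)[1])           # only the first contact is visible
    return visible


scene = {
    "wall": ((10.0, 0.0), 3.0),     # large occluder
    "hidden": ((20.0, 0.0), 2.0),   # entirely behind the wall
    "tiny": ((5.0, 1.5), 0.05),     # unoccluded but very small
}
coarse = visible_objects(scene, [-0.4, -0.2, 0.0, 0.2, 0.4])
dense = visible_objects(scene, [-0.5 + 0.005 * i for i in range(201)])
print(sorted(coarse), sorted(dense))
```

In this toy scene the coarse five-ray sample reports only the wall, while the denser sample also finds the tiny object; the occluded object is reported by neither.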

[0012]
Visible-object determination has its roots in visible-surface determination. Foley et al. [Foley, J., van Dam, A., Feiner, S. and Hughes, J. Computer Graphics: Principles and Practice, 2nd ed., Addison-Wesley, Chapter 15, pp. 649-718, 1996] classify visible-surface determination approaches into two broad groups: image-precision and object-precision algorithms. Image-precision algorithms typically operate at the resolution of the display device and tend to have superior performance computationally. Object-precision approaches operate in object space, usually performing object-to-object comparisons.

[0013]
A prototypical image-precision visible-surface-determination algorithm casts rays from the viewpoint through the center of each display pixel to determine the nearest visible surface along each ray. The list of applications of visible-surface ray casting (or ray tracing) is long and distinguished. Appel [“Some Techniques for Shading Machine Rendering of Solids”, SJCC'68, pp. 37-45, 1968] uses ray casting for shading. Goldstein and Nagel [Mathematical Applications Group, Inc., “3-D Simulated Graphics Offered by Service Bureau,” Datamation, 13(1), February 1968, p. 69; see also Goldstein, R. A. and Nagel, R., “3-D Visual Simulation”, Simulation, 16(1), pp. 25-31, 1971] use ray casting for boolean set operations. Kay et al. [Kay, D. S. and Greenberg, D., “Transparency for Computer Synthesized Images,” SIGGRAPH'79, pp. 158-164] and Whitted [“An Improved Illumination Model for Shaded Display”, CACM, 23(6), pp. 343-349, 1980] use ray tracing for refraction and specular reflection computations. Airey et al. [Airey, J. M., Rohlf, J. H. and Brooks, Jr. F. P., “Towards Image Realism with Interactive Update Rates in Complex Virtual Building Environments”, ACM SIGGRAPH Symposium on Interactive 3D Graphics, 24(2), 1990, pp. 41-50] use ray casting for computing the portion of a model visible from a given cell.

[0014]
Another approach to visible-surface determination relies on sending beams or cones into a database of surfaces [see Dadoun et al., “Hierarchical approaches to hidden surface intersection testing”, Proceedings of Graphics Interface '82, Toronto, May 1982, pp. 49-56; see also Dadoun et al., “The geometry of beam tracing”, In Joseph O'Rourke, ed., Proceedings of the Symposium on Computational Geometry, pp. 55-61, ACM Press, New York, 1985]. Essentially, beams become a replacement for rays. The approach usually results in compact beams decomposing into a set of possibly non-connected cone(s) after interacting with an object.

[0015]
A variety of spatial subdivision schemes have been used to impose a spatial structure on the objects in a scene. The following four references pertain to spatial subdivision schemes: (a) Glassner, “Space subdivision for fast ray tracing,” IEEE CG&A, 4(10):15-22, Oct. 1984; (b) Jevans et al., “Adaptive voxel subdivision for ray tracing,” Proceedings Graphics Interface '89, 164-172, June 1989; (c) Kaplan, M. “The use of spatial coherence in ray tracing,” in Techniques for Computer Graphics . . . , Rogers, D. and Earnshaw, R. A. (eds), Springer-Verlag, New York, 1987; and (d) Rubin, S. M. and Whitted, T. “A 3-dimensional representation for fast rendering of complex scenes,” Computer Graphics, 14(3):110-116, July 1980.

[0016]
Kay et al. [Kay, T. L. and Kajiya, J. T., “Ray Tracing Complex Scenes”, SIGGRAPH 1986, pp. 269-278, 1986], concentrating on the computational aspect of ray casting, employed a hierarchy of spatial bounding volumes in conjunction with rays, to determine the visible objects along each ray. Of course, the spatial hierarchy needs to be precomputed. However, once in place, such a hierarchy facilitates a recursive computation for finding objects. If the environment is stationary, the same data structure facilitates finding the visible object along any ray from any origin.

[0017]
Teller et al. [Teller, S. and Sequin, C. H., “Visibility Preprocessing for Interactive Walkthroughs,” SIGGRAPH '91, pp. 61-69] use preprocessing to full advantage in visible-object computation by precomputing cell-to-cell visibility. Their approach is essentially an object-precision approach, and they report over 6 hours of preprocessing time to calculate 58 Mbytes of visibility information for a 250,000 polygon model on a 50 MIPS machine [Teller, S. and Sequin, C. H., “Visibility computations in polyhedral three-dimensional environments,” U. C. Berkeley Report No. UCB/CSD 92/680, April 1992].

[0018]
In a different approach to visibility computation, Greene et al. [Greene, N., Kass, M., and Miller, G., “Hierarchical Z-Buffer Visibility,” SIGGRAPH '93, pp. 231-238] use a variety of hierarchical data structures to help exploit the spatial structure inherent in object space (an octree of objects), the image structure inherent in pixels (a Z-pyramid), and the temporal structure inherent in frame-by-frame rendering (a list of previously visible octree nodes). The Z-pyramid permits the rapid culling of large portions of the model by testing for visibility using a rapid scan conversion of the cubes in the octree.

[0019]
As used herein, the term “octree” refers to a data structure derived from a hierarchical subdivision of a three-dimensional space based on octants. The three-dimensional space may be divided into octants based on three mutually perpendicular partitioning planes. Each octant may be further partitioned into eight sub-octants based on three more partitioning planes. Each sub-octant may be partitioned into eight sub-sub-octants, and so forth. Each octant, sub-octant, etc., may be assigned a node in the data structure. For more information concerning octrees, see pages 550-555, 559-560 and 695-698 of Computer Graphics: Principles and Practice, James D. Foley et al., 2nd edition in C, ISBN 0201848406, T385.C5735, 1996.
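The octant subdivision just defined can be sketched as follows. This is a minimal illustration under stated assumptions: the space is an axis-aligned cube, the stored items are points, only non-empty octants receive a node, and the class name `OctreeNode` and its parameters are hypothetical.

```python
class OctreeNode:
    """One cube of space; subdivided into octants while it holds too many points."""

    def __init__(self, origin, size, points, max_points=1, depth=0, max_depth=8):
        self.origin, self.size = origin, size
        self.points = list(points)
        self.children = []
        if len(self.points) > max_points and depth < max_depth:
            half = size / 2.0
            # Three mutually perpendicular partitioning planes through the center.
            cx, cy, cz = (o + half for o in origin)
            buckets = {}
            for p in self.points:
                key = (p[0] >= cx, p[1] >= cy, p[2] >= cz)
                buckets.setdefault(key, []).append(p)
            for (bx, by, bz), pts in buckets.items():
                child_origin = (origin[0] + half * bx,
                                origin[1] + half * by,
                                origin[2] + half * bz)
                self.children.append(
                    OctreeNode(child_origin, half, pts, max_points, depth + 1, max_depth))
            self.points = []  # interior nodes keep no points of their own


def count_leaves(node):
    """Number of leaf nodes below (and including) the given node."""
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children)


root = OctreeNode((0.0, 0.0, 0.0), 8.0, [(1, 1, 1), (7, 7, 7), (1, 7, 1)])
print(len(root.children), count_leaves(root))
```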

[0020]
The depth complexity of graphical environments continues to increase in response to consumer demand for realism and performance. Thus, the efficiency of an algorithm for visible object determination has a direct impact on the marketability of a visualization system. The computational bandwidth required by the visible object determination algorithm determines the class of processor required for the visualization system, and thereby affects overall system cost. Thus, a system and method for improving the efficiency of visible object determination is greatly desired.
SUMMARY OF THE PRESENT INVENTION

[0021]
Various embodiments of a system and method for performing visible object determination based upon a dual search of a cone hierarchy and a bound hierarchy are herein disclosed. In one embodiment, the system may comprise a plurality of processors, a display device, a shared memory, and optionally a graphics accelerator. The multiple processors execute a parallel visibility algorithm which operates on a collection of graphical objects to determine a visible subset of the objects from a defined viewpoint. The objects may reside in a threedimensional space and thus admit the possibility of occluding one another.

[0022]
The parallel visibility algorithm represents space in terms of a hierarchy of cones emanating from a viewpoint. In one embodiment, the leaf-cones of the cone hierarchy, i.e. the cones at the ultimate level of refinement, subtend an area which corresponds to a fraction of a pixel in screen area. For example, two cones may conveniently fill the area of a pixel. In other embodiments, a leaf-cone may subtend areas which include one or more pixels.

[0023]
An initial view frustum or neighborhood of the view frustum may be recursively tessellated (i.e. refined) to generate a cone hierarchy. Alternatively, the entire space around the viewpoint may be recursively tessellated to generate the cone hierarchy. In this embodiment, the cone hierarchy is recomputed for changes in the viewpoint and view direction.

[0024]
The multiple processors or some subset thereof, or another set of one or more processors, may also generate a hierarchy of bounds from the collection of objects. In particular, the bound hierarchy may be generated by: (a) recursively grouping clusters starting with the objects themselves as order-zero clusters, (b) bounding each object and cluster (of all orders) with a corresponding bound, e.g. a polytope hull, (c) allocating a node in the bound hierarchy for each object and cluster, and (d) organizing the nodes in the bound hierarchy to reflect cluster membership. For example, if node A is the parent of node B, the cluster corresponding to node A contains a subcluster (or object) corresponding to node B. Each node stores parameters which characterize the bound of the corresponding cluster or object.
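Steps (a) through (d) above can be sketched as follows. This is a hedged illustration, not the disclosed clustering method: it assumes axis-aligned boxes as the bounds and substitutes a greedy nearest-pair grouping for the clustering criterion, and the names `BoundNode`, `box_of`, `center_gap2`, and `build_bound_hierarchy` are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class BoundNode:
    lo: tuple                       # minimum corner of the bounding box
    hi: tuple                       # maximum corner of the bounding box
    obj: str = None                 # object name for order-zero clusters
    children: list = field(default_factory=list)


def box_of(points):
    """Axis-aligned box (lo, hi) enclosing a list of points."""
    dims = range(len(points[0]))
    return (tuple(min(p[d] for p in points) for d in dims),
            tuple(max(p[d] for p in points) for d in dims))


def center_gap2(a, b):
    """Squared distance between the centers of two bounds."""
    return sum(((al + ah) - (bl + bh)) ** 2 / 4.0
               for al, ah, bl, bh in zip(a.lo, a.hi, b.lo, b.hi))


def build_bound_hierarchy(objects):
    # (b), (c): bound every object and allocate a node for it.
    nodes = [BoundNode(*box_of(pts), obj=name) for name, pts in objects.items()]
    # (a), (d): repeatedly group the two closest clusters under a parent node.
    while len(nodes) > 1:
        _, i, j = min((center_gap2(nodes[i], nodes[j]), i, j)
                      for i, j in combinations(range(len(nodes)), 2))
        a, b = nodes[i], nodes[j]
        lo, hi = box_of([a.lo, a.hi, b.lo, b.hi])
        parent = BoundNode(lo, hi, children=[a, b])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [parent]
    return nodes[0]


def leaf_names(node):
    """Names of all objects contained in the cluster rooted at node."""
    if not node.children:
        return [node.obj]
    return [name for child in node.children for name in leaf_names(child)]


objects = {
    "J00": [(0, 0), (1, 1)], "J01": [(2, 0), (3, 1)],
    "J10": [(10, 0), (11, 1)], "J11": [(12, 0), (13, 1)],
}
root = build_bound_hierarchy(objects)
print(root.lo, root.hi, [leaf_names(c) for c in root.children])
```

Each parent's box encloses the boxes of its children, so a parent node contains every subcluster or object below it, matching the parent/child relationship described above.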

[0025]
The cone hierarchy and bound hierarchy may be stored in the shared memory. In addition, the shared memory may store a global problem queue. The global problem queue is initially loaded with a collection of bound-cone pairs. Each bound-cone pair points to a bound in the bound hierarchy and a cone in the cone hierarchy.

[0026]
The multiple processors may couple to the shared memory, and may perform a search of the cone and bound hierarchies to identify one or more nearest objects for a subset of cones (e.g. the leaf cones) in the cone hierarchy. After the multiple processors complete the search of the cone and bound hierarchies, a transmission agent (e.g. the multiple processors, some subset thereof, or another set of one or more processors) may transmit graphics primitives, e.g. triangles, corresponding to the nearest objects of each cone in the subset, to a rendering agent. The rendering agent (e.g. the graphics accelerator, or a software renderer executing on the multiple processors, some subset thereof, or another set of one or more processors) is operable to receive the graphics primitives, to perform rendering computations on the graphics primitives to generate a stream of pixels, and to transmit the pixel stream to the display device.

[0027]
In some embodiments, each leaf-cone may be assigned a visibility distance value which represents the distance to the closest known object as perceived from within the leaf-cone. Each leaf-cone may also be assigned an object pointer which specifies the closest known object within view of the leaf-cone. Similarly, each non-leaf cone may be assigned a visibility distance value. However, the visibility distance value of a non-leaf cone may be set equal to the maximum of the visibility distance values for its subcone children. This implies that the visibility distance value for each non-leaf cone equals the maximum of the visibility distance values of its leaf-cone descendants.
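The maximum rule for non-leaf cones can be sketched as a bottom-up pass over the cone tree. The class and function names here are hypothetical, and an unset leaf is assumed to start at an infinite visibility distance.

```python
import math


class Cone:
    """A node in a cone hierarchy; leaves have no children."""

    def __init__(self, children=(), visibility_distance=math.inf):
        self.children = list(children)
        self.visibility_distance = visibility_distance


def refresh_visibility(cone):
    """Set each non-leaf cone's value to the maximum over its subcone children."""
    if cone.children:
        cone.visibility_distance = max(
            refresh_visibility(child) for child in cone.children)
    return cone.visibility_distance


left = Cone([Cone(visibility_distance=3.0), Cone(visibility_distance=5.0)])
right = Cone([Cone(visibility_distance=2.0), Cone()])  # one leaf has seen nothing yet
root = Cone([left, right])
refresh_visibility(root)
print(left.visibility_distance, right.visibility_distance, root.visibility_distance)
```

One way to read the rule: a bound that is farther away than a non-leaf cone's value is farther than the closest known object in every leaf beneath that cone, so it cannot change any of them. A single leaf with no known object keeps its ancestors at infinity.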

[0028]
In one embodiment, each of the plurality of processors is operable to: (a) read a bound-cone pair (H,C) from the global problem queue, (b) compute the distance between the bound H and the cone C, (c) compare the bound-cone distance to a visibility distance associated with the cone C, and (d) write two or more dependent bound-cone pairs to the global problem queue if the bound-cone distance is smaller than the visibility distance of the cone C. The two or more dependent bound-cone pairs may be pairs generated from bound H and the subcones of cone C, or pairs generated from cone C and sub-bounds of bound H.

[0029]
Furthermore, when the processor detects that the hull H is a leaf bound of the bound hierarchy and the cone C is a leaf cone of the cone hierarchy, the processor may update the visibility information for the leaf cone, i.e. may set the visibility distance value for cone C equal to the cone-hull distance computed in (b) above, and may set the nearest object pointer associated with cone C equal to a pointer associated with hull H.
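The per-processor steps (a) through (d) and the leaf update can be sketched as a single-threaded loop over one queue. This is a toy under loudly stated assumptions: cones and bounds are reduced to angular intervals in a plane, `toy_distance` is a stand-in for a real bound-cone distance computation, the rule for choosing which side to refine is arbitrary, and all names (`Bound`, `Cone`, `search`, `propagate_up`) are hypothetical.

```python
import math
from collections import deque


class Bound:
    def __init__(self, lo, hi, near, obj=None, children=()):
        self.lo, self.hi, self.near = lo, hi, near   # angular extent, nearest depth
        self.obj, self.children = obj, list(children)


class Cone:
    def __init__(self, lo, hi, depth, parent=None):
        self.lo, self.hi, self.parent = lo, hi, parent
        self.vis, self.nearest, self.children = math.inf, None, []
        if depth > 0:
            mid = (lo + hi) / 2.0
            self.children = [Cone(lo, mid, depth - 1, self),
                             Cone(mid, hi, depth - 1, self)]


def toy_distance(bound, cone):
    """Assumed stand-in: the bound's near depth if the extents overlap, else infinity."""
    overlap = max(bound.lo, cone.lo) < min(bound.hi, cone.hi)
    return bound.near if overlap else math.inf


def propagate_up(cone):
    """Keep each non-leaf cone at the maximum of its children's values."""
    while cone.parent is not None:
        cone = cone.parent
        cone.vis = max(child.vis for child in cone.children)


def search(root_bound, root_cone):
    queue = deque([(root_bound, root_cone)])          # the problem queue
    while queue:
        hull, cone = queue.popleft()                  # (a) read a pair
        d = toy_distance(hull, cone)                  # (b) bound-cone distance
        if d >= cone.vis:                             # (c) compare
            continue
        if not hull.children and not cone.children:   # leaf-leaf: record nearest
            cone.vis, cone.nearest = d, hull.obj
            propagate_up(cone)
        elif hull.children:                           # (d) refine the bound ...
            queue.extend((sub, cone) for sub in hull.children)
        else:                                         # ... or refine the cone
            queue.extend((hull, sub) for sub in cone.children)


def leaf_cones(cone):
    if not cone.children:
        return [cone]
    return [leaf for child in cone.children for leaf in leaf_cones(child)]


a = Bound(0.0, 0.4, 5.0, obj="A")
b = Bound(0.3, 1.0, 2.0, obj="B")
c = Bound(0.0, 1.0, 9.0, obj="C")                     # lies behind A and B everywhere
root_bound = Bound(0.0, 1.0, 2.0, children=[a, b, c])
root_cone = Cone(0.0, 1.0, depth=2)
search(root_bound, root_cone)
nearest = [leaf.nearest for leaf in leaf_cones(root_cone)]
print(nearest)
```

In a multiprocessor setting the queue would be shared and the reads, writes, and leaf updates would need synchronization, which this sketch does not attempt to show.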

[0030]
In one alternative embodiment, each processor may couple to a local memory containing a local problem queue. Each processor may read and write bound-cone pairs from/to its local problem queue, and access the global problem queue to read initial bound-cone pairs.

[0031]
In another alternative embodiment, a collection of cones may be selected from the cone hierarchy, i.e. a collection of non-overlapping cones which fill the space of the root cone (i.e. top level cone). The cones of the collection may be distributed among the multiple processors. Each of the multiple processors may perform a search of its assigned cones (i.e. the subtrees of the cone hierarchy defined by these assigned cones) against the hull tree.
BRIEF DESCRIPTION OF THE FIGURES

[0032]
The foregoing, as well as other objects, features, and advantages of this invention may be more completely understood by reference to the following detailed description when read together with the accompanying drawings in which:

[0033]
FIG. 1 illustrates the ray-based method of visible object detection according to the prior art;

[0034]
FIG. 2A illustrates one embodiment of a graphical computing system for performing visible object determination;

[0035]
FIG. 2B is a block diagram illustrating one embodiment of the graphical computing system 80;

[0036]
FIG. 3 is a flowchart for processing operations performed in one embodiment of graphical computing system 80;

[0037]
FIG. 4A illustrates a collection of objects in a graphics environment;

[0038]
FIG. 4B illustrates a first step in one embodiment of a method for forming a hull hierarchy, i.e. the step of bounding objects with containing hulls and allocating hull nodes for the containing hulls;

[0039]
FIG. 4C illustrates one embodiment of the process of grouping together hulls to form higher order hulls, and allocating nodes in the hull hierarchy which correspond to the higher order hulls;

[0040]
FIG. 4D illustrates a final stage in the recursive grouping process wherein all objects are contained in a universal containing hull which corresponds to the root node of the hull hierarchy;

[0041]
FIG. 5A illustrates the mathematical expressions which describe lines and half-planes in two-dimensional space;

[0042]
FIG. 5B illustrates the description of a rectangular region as the intersection of four half-planes in a two-dimensional space;

[0043]
FIG. 6 illustrates a two-dimensional cone C partitioned into two subcones C_{1} and C_{2} which interact with a collection of objects;

[0044]
FIG. 7 illustrates polyhedral cones with rectangular and triangular cross-section emanating from the origin of a three-dimensional space;

[0045]
FIG. 8A illustrates mathematical expressions which describe a line through the origin and a corresponding half-plane given a normal vector in two-dimensional space;

[0046]
FIG. 8B illustrates the specification of a two-dimensional conic region as the intersection of two half-planes;

[0047]
FIGS. 9A-9C illustrate the formation of a cone hierarchy based on repeated subdivision of an initial cone with rectangular cross-section;

[0048]
FIG. 10A illustrates one embodiment of a program thread 250 which is executed by each of multiple processors to accomplish a dual search of the hull hierarchy and cone hierarchy;

[0049]
FIG. 10B illustrates an embodiment of graphical computing system 80 where each of multiple processors reads initial hull-cone pairs from a global problem queue, and accesses non-initial hull-cone pairs from a corresponding local queue;

[0050]
FIG. 10C illustrates an embodiment of graphical computing system 80 where cones from the cone hierarchy are distributed among a plurality of processors, and each processor searches the assigned cones (and their descendants) with respect to the hull hierarchy;

[0051]
FIG. 10D illustrates a cone C which has a small normalized size compared to a bound hull H;

[0052]
FIG. 10E illustrates a hull H which has a small normalized size compared to a cone C;

[0053]
FIG. 11 illustrates one embodiment of the process of recursively clustering a collection of objects to form a bounding hierarchy.

[0054]
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims. Please note that the section headings used herein are for organizational purposes only and are not meant to limit the description or claims. The word “may” is used in this application in a permissive sense (i.e., having the potential to, being able to), not a mandatory sense (i.e., must). Similarly, the word “include”, and derivations thereof, are used herein to mean “including, but not limited to.”
DETAILED DESCRIPTION OF SEVERAL EMBODIMENTS

[0055]
FIG. 2A presents one embodiment of a graphical computing system 80 for performing visible object determination. Graphical computing system 80 may include a system unit 82, and a display device 84 coupled to the system unit 82. The display device 84 may be realized by any of various types of video monitors or graphical displays. Graphical computing system 80 may include one or more input devices such as a keyboard 86, a mouse 88, a trackball, a digitizing pad, a joystick, etc.

[0056]
FIG. 2B is a block diagram illustrating one embodiment of graphical computing system 80. Graphical computing system 80 may include a plurality of processors PR_{1} through PR_{M} and a shared memory 106 coupled to a high-speed system bus 104. Graphical computing system 80 may also include a graphics accelerator 112 coupled to system bus 104 and display device 84.

[0057]
Each processor PR_{I} may couple to a dedicated local memory (not shown) for storing local code and/or local data. (The notation PR_{I} refers to an arbitrary one of the processors PR_{1} through PR_{M}.) The shared memory 106 may include any of various types of memory subsystems including random access memory, read only memory, and/or mass storage devices. Processors PR_{1} through PR_{M} operate on a set of objects to determine a subset of the objects which are visible from a particular viewpoint in a three-dimensional scene. Each object in the original set may comprise a collection of graphics primitives (e.g. triangles). In one embodiment, objects may be described in terms of a system of equations and/or geometric constraints, e.g. polynomial equations. In this case, the visible objects may need to be decomposed (i.e. partitioned) into graphics primitives (e.g. tessellated into triangles) prior to pixel rendering and display. The object decomposition may be performed by processors PR_{1} through PR_{M}, some subset thereof, and/or by a second set of one or more processors (not shown).

[0058]
Graphics primitives (e.g. triangles) corresponding to the visible objects may be transmitted to graphics accelerator 112 for rendering and display on display device 84. Since graphics accelerator 112 operates on primitives corresponding to the visible objects, a higher percentage of rendered pixels (or supersamples) survive the z-comparison than if the graphics accelerator 112 were supplied with primitives corresponding to the full object set. In other words, the rendering hardware in graphics accelerator 112 may operate with increased efficiency.

[0059]
In one alternative embodiment, processors PR_{1} through PR_{M}, or some subset thereof, and/or another set of one or more processors (not shown) may perform pixel rendering computations on the graphics primitives corresponding to the visible objects, and may generate a stream of pixels which are transmitted to display device 84 for image display. In this alternative embodiment, graphics accelerator 112 may not be included in graphical computing system 80.

[0060]
In one embodiment, graphics accelerator 112 comprises a plurality of graphics processors, and these graphics processors may perform the visible object determination instead of processors PR_{1} through PR_{M}. In this embodiment, graphics accelerator 112 may receive a set of objects (or pointers to the objects) in a 3D scene. The graphics processors may operate on the set of objects to determine the subset of visible objects with respect to a current viewpoint in the 3D scene. In another embodiment, processors PR_{1} through PR_{M} and the graphics processors in graphics accelerator 112 may cooperate to determine the set of visible objects.

[0061]
As mentioned above, 3D graphics accelerator 112 may couple to system bus 104, and display device 84 may couple to graphics accelerator 112. 3D graphics accelerator 112 may be a specialized graphics rendering subsystem which is designed to offload the 3D rendering functions from the host system 82, thus providing improved system performance. It is assumed that various other peripheral devices, or other buses, may be connected to system bus 104, as is well known in the art. If 3D accelerator 112 is not included in graphical computing system 80, display device 84 may couple directly to system bus 104.

[0062]
Processor devices (e.g. processors PR_{1} through PR_{M}) coupled to system bus 104 may transfer information to and from graphics accelerator 112 according to a programmed input/output (I/O) protocol over the system bus 104. In one embodiment, graphics accelerator 112 may access shared memory 106 according to a direct memory access (DMA) protocol or through intelligent bus mastering. In another embodiment, graphics accelerator 112 may couple to shared memory 106 through an Accelerated Graphics Port (AGP) connection. Processors PR_{1} through PR_{M} may operate under the control of visualization software stored in shared memory 106 and/or the local memories of the individual processors.

[0063]
FIG. 3 is a flowchart for one embodiment of the processing performed by graphical computing system 80 in response to the visualization software. In an initial step 210, the graphical computing system 80 may receive a plurality of objects and construct an object hierarchy from the plurality of objects. (The object hierarchy construction is discussed in more detail below.) It is noted that the object hierarchy may have been precomputed, in which case step 210 may be skipped.

[0064]
In step 220, graphical computing system 80 may discover the set of visible objects in the scene with respect to a current viewpoint. In the preferred embodiment, graphical computing system 80 may be configured to compute visibility for three-dimensional objects from a viewpoint in a three-dimensional coordinate space. However, the methodologies herein described naturally generalize to spaces of arbitrary dimension.

[0065]
In one embodiment of graphical computing system 80, the viewpoint and view direction in the graphical environment may be changed in response to user input. For example, by manipulating mouse 88, depressing keys on keyboard 86, or manipulating a joystick or game control pad, the user may cause the viewpoint and/or view direction to change. Thus, graphical computing system 80 may recompute the set of visible objects whenever the viewpoint and/or the view orientation changes. Furthermore, it is quite often the case that objects may move within the 3D scene. Thus, graphical computing system 80 may recompute the set of visible objects when the objects in the scene move.

[0066]
In step 225, graphical computing system 80 may transmit graphics primitives (e.g. triangles) corresponding to the visible objects to a rendering agent for pixel rendering.

[0067]
In step 230, the rendering agent may perform rendering computations on the graphics primitives to generate a stream of pixels. In one embodiment, the rendering agent may be graphics accelerator 112. In another embodiment, the rendering agent may be a software renderer running on one or more processors (e.g. processors PR_{1} through PR_{M} or some subset thereof) configured within graphical computing system 80.

[0068]
In step 230, the rendering agent may transmit the pixel stream to display device 84 for image display.

[0069]
Visible object determination step 220 may be performed repeatedly as the viewpoint and/or view direction (i.e. orientation) changes, and/or as the objects themselves evolve in time.

[0070]
In some embodiments, objects may be modeled as opaque convex polytopes. A three-dimensional solid is said to be convex if any two points in the solid (or on the surface of the solid) may be connected with a line segment which resides entirely within the solid. Thus a solid cube is convex, while a donut (i.e. solid torus) is not. A polytope is an object with planar sides (e.g. cube, tetrahedron, etc.). The methodologies described herein for opaque objects naturally extend to transparent or semi-transparent objects by not allowing such objects to terminate a cone computation. Although not all objects are convex, every object can be approximated as a union of convex polytopes. It is helpful to note that the visible-object-set computation does not require an exact computation, but rather a conservative one. In other words, it is permissible to estimate a superset of the set of visible objects.

[0071]
Constructing the Object Hierarchy

[0072]
Initially, the objects in a scene may be organized into a hierarchy that groups objects spatially. An octree is one possibility for generating the object hierarchy. However, in the preferred embodiment, a clustering algorithm is used which groups nearby objects, then recursively clusters pairs of groups into larger containing spaces. The clustering algorithm employs a simple distance measure and threshold operation to achieve the object clustering. FIGS. 4A-4D illustrate one embodiment of a clustering process for a collection of four objects J00 through J11. The objects are indexed in a fashion which anticipates their ultimate position in a binary tree of object groups. The objects are depicted as polygons situated in a plane (see FIG. 4A). However, the reader may imagine these objects as arbitrary three-dimensional objects. In one embodiment, the objects are three-dimensional polytopes.

[0073]
Each object may be bounded, i.e. enclosed, by a corresponding bounding surface referred to herein as a bound. In the preferred embodiment, the bound for each object is a polytope hull (i.e. a hull having planar faces) as shown in FIG. 4B. The hulls H00 through H11 are given labels which are consistent with the objects they bound. For example, hull H00 bounds object J00. The hulls are illustrated as rectangles with sides parallel to a pair of coordinate axes. These hulls are intended to represent rectangular boxes (parallelepipeds) in three dimensions whose sides are normal to a fixed set of coordinate axes. For each hull a corresponding node data structure is generated. The node stores parameters which characterize the corresponding hull.

[0074]
Since a hull has a surface which is comprised of a finite number of planar components, the description of a hull is intimately connected to the description of a plane in three-space. In FIG. 5A, a two-dimensional example is given from which the equation of an arbitrary plane may be generalized. A unit vector n [any vector suffices but a vector of length one is convenient for discussion] defines a line L through the origin of the two-dimensional space. By taking the dot product v•n of a vector v with the unit vector n, one obtains the length of the projection of vector v in the direction defined by unit vector n. Thus, given a real constant c, it follows that the equation x•n=c, where x is a vector variable, defines a line M perpendicular to line L and situated at a distance c from the origin along line L. In the context of three-dimensional space, this same equation defines a plane perpendicular to the line L, again displaced distance c from the origin along line L. Observe that the constant c may be negative, in which case the line (or plane) M is displaced from the origin by distance |c| along line L in the direction opposite to unit vector n.

[0075]
The line x•n=c divides the plane into two half-planes. By replacing the equality in the above equation with an inequality, one obtains the description of one of these half-planes. The inequality x•n<c defines the half-plane which contains the negative infinity end of line L. [The unit vector n defines the positive direction of line L.] In three dimensions, the plane x•n=c divides the three-dimensional space into two half-spaces. The inequality x•n<c defines the half-space which contains the negative infinity end of line L.

[0076]
FIG. 5B shows how a rectangular region may be defined as the intersection of four half-planes. Given four normal vectors n_{1} through n_{4}, and four corresponding constants c_{1} through c_{4}, a rectangular region is defined as the set of points which simultaneously satisfy the set of inequalities x•n_{i}<c_{i}, where i ranges from one to four. This system of inequalities may be summarized by the matrix-vector expression N•x<c, where the rows of matrix N are the normal vectors n_{1} through n_{4}, and the components of vector c are the corresponding constants c_{1} through c_{4}. If the normal vectors are chosen so as to lie in the positive and negative axial directions (as shown in FIG. 5B), the resulting rectangular region has sides parallel to the axes. It is noted that the rectangular hulls H00 through H11 shown in FIG. 4B all use a common set of normal vectors. Thus, each hull is characterized by a unique c vector.

[0077]
In three-dimensional space, a rectangular box may be analogously defined as the intersection of six half-spaces. Given six normal vectors n_{1} through n_{6}, oriented in each of the three positive and three negative axial directions, and six corresponding constants c_{1} through c_{6}, the simultaneous solution of the inequalities x•n_{i}<c_{i}, where i runs from one to six, defines a rectangular box with sides parallel to the coordinate planes. Thus, a rectangular box may be compactly represented with the same matrix-vector expression Nx<c, where matrix N now has six rows for the six normal vectors, and vector c has six elements for the six corresponding constants.
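By way of non-limiting illustration, the following Python sketch stores a rectangular box as a c vector paired with the six axial normals and tests whether a point satisfies the system of inequalities. The helper names `box_to_c` and `contains`, the row ordering of the normals, and the use of non-strict inequalities (a closed box) are assumptions made for this example only.

```python
# Six axial normals shared by every axis-aligned hull (the rows of N).
# The ordering (+x, -x, +y, -y, +z, -z) is an assumption of this sketch.
AXIAL_NORMALS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def box_to_c(lo, hi):
    """Return the c vector of the box lo <= x <= hi (hypothetical helper)."""
    return [hi[0], -lo[0], hi[1], -lo[1], hi[2], -lo[2]]

def contains(c, x, normals=AXIAL_NORMALS):
    """True if x satisfies every inequality n_i . x <= c_i."""
    return all(dot(n, x) <= ci for n, ci in zip(normals, c))

c = box_to_c(lo=(0, 0, 0), hi=(2, 1, 1))
inside = contains(c, (1.0, 0.5, 0.5))    # point within the box
outside = contains(c, (3.0, 0.5, 0.5))   # violates the +x inequality
```

Because the normals are shared, only the six numbers in c need to be kept per hull.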

[0078]
To construct an object hierarchy, object hulls H00 through H11 are paired together as shown in FIG. 4C. Each pair of object hulls is bounded by a containing hull. For example, hulls H00 and H01 are paired together and bounded by containing hull H0. Containinghull H0 contains the two component hulls H00 and H01. Likewise, object hulls H10 and H11 are paired together and bounded by containinghull H1. In addition, two parent nodes are generated in the object hierarchy, one for each of the containinghulls H0 and H1. For simplicity, the parent nodes are commonly labeled as their corresponding containinghulls. Thus, parent node H0 points to its children nodes H00 and H01, while parent node H1 points to its children nodes H10 and H11. Each parent node contains the characterizing c vector for the corresponding containinghull.

[0079]
The containing-hulls H0 and H1 may be referred to as first-order containing-hulls since they are the result of a first pairing operation on the original object hulls. A second pairing operation is applied to the first-order containing-hulls to obtain second-order containing-hulls. Each second-order containing-hull contains two first-order hulls. For each of the second-order containing-hulls a parent node is generated in the object hierarchy. The parent node reflects the same parent-child relationship as the corresponding second-order containing-hull. For example, in FIG. 4D, second-order containing-hull H contains first-order containing-hulls H0 and H1. Thus, parent node H in the object hierarchy points to children nodes H0 and H1. Parent node H stores the characterizing vector c for the containing-hull H. In the example presented in FIGS. 4A-4D, the object hierarchy is complete after two pairing operations since the original object collection contained only four objects.

[0080]
In general, a succession of pairing operations is performed. At each stage, a higherorder set of containinghulls and corresponding nodes for the object hierarchy are generated. Each node contains the describing vector c for the corresponding containinghull. At the end of the process, the object hierarchy comprises a binary tree with a single root node. The root node corresponds to a total containinghull which contains all subhulls of all orders including all the original objecthulls. The object hierarchy, because it comprises a hierarchy of bounding hulls, will also be referred to as the hull hierarchy. In the preferred embodiment, the pairing operations are based on proximity, i.e. objects (and hulls of the same order) are paired based on proximity. Proximity based pairing may result in a more efficient visible object determination algorithm. This tree of containing hulls provides a computationally efficient and hierarchical representation of the entire scene. For instance, when a cone completely misses a node's containinghull, none of the node's descendents need to be examined.
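The succession of pairing operations may be sketched as follows. This is a minimal sketch under stated assumptions: hulls are axis-aligned boxes stored as c vectors in the order (+x, -x, +y, -y, +z, -z), proximity is measured between box centers, and pairs are chosen greedily. The text above specifies only that pairing is based on proximity, so the class `HullNode` and the functions `pair_up` and `build_hierarchy` are hypothetical.

```python
class HullNode:
    """Node of the hull hierarchy; stores only the c vector of its hull."""
    def __init__(self, c, children=(), obj=None):
        self.c = list(c)
        self.children = list(children)
        self.obj = obj

def center(c):
    # c = [xmax, -xmin, ymax, -ymin, zmax, -zmin], so the midpoint is (max + min) / 2.
    return [(c[2 * k] - c[2 * k + 1]) / 2 for k in range(3)]

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def pair_up(level):
    """One pairing operation: pair each hull with its nearest unpaired neighbor."""
    remaining = list(level)
    parents = []
    while len(remaining) > 1:
        a = remaining.pop(0)
        b = min(remaining, key=lambda h: dist2(center(a.c), center(h.c)))
        remaining.remove(b)
        merged = [max(x, y) for x, y in zip(a.c, b.c)]   # containing-hull
        parents.append(HullNode(merged, children=[a, b]))
    return parents + remaining   # an unpaired hull is carried up unchanged

def build_hierarchy(leaves):
    level = list(leaves)
    while len(level) > 1:
        level = pair_up(level)
    return level[0]

def box(lo, hi, name):
    return HullNode([hi[0], -lo[0], hi[1], -lo[1], hi[2], -lo[2]], obj=name)

leaves = [
    box((0, 0, 0), (1, 1, 1), "J00"),
    box((10, 0, 0), (11, 1, 1), "J10"),
    box((1.5, 0, 0), (2.5, 1, 1), "J01"),
    box((11.5, 0, 0), (12.5, 1, 1), "J11"),
]
root = build_hierarchy(leaves)
```

With four object hulls the loop performs two pairing operations and returns a single root node, mirroring FIGS. 4A-4D.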

[0081]
Bounding hulls (i.e. containing hulls) serve the purpose of simplifying and approximating objects. Any hierarchy of containing hulls works in principle. However, hierarchies of hulls based on a common set of normal vectors are particularly efficient computationally. A collection of hulls based on a common set of normal vectors will be referred to herein as a fixed-direction or commonly-generated collection. As described above, a polytope hull is described by a bounding system of linear inequalities {x: Nx≤c}, where the rows of the matrix N are a set of normal vectors, and the elements of the vector c define the distances to move along each of the normal vectors to obtain a corresponding side of the polytope. In a fixed-direction collection of hulls, the normal matrix N is common to all the hulls in the collection, while the vector c is unique for each hull in the collection. The problem of calculating the coefficient vector c for a containing hull given a collection of subhulls is greatly simplified when a common set of normal vectors is used. In addition, the nodes of the hull hierarchy may advantageously consume less memory space since the normal matrix N need not be stored in the nodes. In some embodiments, the hull hierarchy comprises a fixed-direction collection of hulls.
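One way to see the simplification: when every hull uses the same matrix N, a valid c vector for a containing hull is the component-wise maximum of the c vectors of its subhulls, since any point satisfying a subhull's inequalities then also satisfies the parent's. The Python sketch below illustrates this; the function name `containing_c` and the example values are assumptions of the sketch.

```python
def containing_c(child_cs):
    """Component-wise maximum of the children's c vectors.

    Valid for any fixed-direction collection: the i-th entry of every
    c vector refers to the same normal vector n_i.
    """
    return [max(values) for values in zip(*child_cs)]

# Two axis-aligned subhulls, c ordered as (+x, -x, +y, -y, +z, -z).
c_h00 = [1, 0, 1, 0, 1, 0]     # box [0,1] x [0,1] x [0,1]
c_h01 = [3, -2, 2, 0, 1, 0]    # box [2,3] x [0,2] x [0,1]
c_h0 = containing_c([c_h00, c_h01])   # box [0,3] x [0,2] x [0,1]
```

The same function applies unchanged to the fourteen-sided or twenty-six-sided hull families, where each c vector simply has more entries.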

[0082]
In a first embodiment, six normal vectors oriented in the three positive and three negative axial directions are used to generate a fixeddirection hierarchy of hulls shaped like rectangular boxes with sides parallel to the coordinate planes. These axisaligned bounding hulls provide a simple representation that has excellent local computational properties. It is easy to transform or compare two axisaligned hulls. However, the approximation provided by axisaligned hulls tends to be rather coarse, often proving costly at more global levels.

[0083]
In a second embodiment, eight normal vectors directed towards the corners of a cube are used to generate a hierarchy of eightsided hulls. For example, the eight vectors (±1,±1,±1) may be used to generate the eightsided hulls. The octahedron is a special case of this hull family.

[0084]
In a third embodiment, fourteen normal vectors, i.e. the six normals which generate the rectangular boxes plus the eight normals which generate the eight-sided boxes, are used to generate a hull hierarchy with fourteen-sided hulls. These fourteen-sided hulls may be described as rectangular boxes with corners shaved off. It is noted that as the number of normal vectors, and therefore the number of sides, increases, the accuracy of the hull's approximation to the underlying object increases.

[0085]
In a fourth embodiment, twelve more normals are added to the fourteen normals just described to obtain a set of twentysix normal vectors. The twelve additional normals serve to shave off the twelve edges of the rectangular box in addition to the corners which have already been shaved off. This results in twentysix sided hulls. For example, the twelve normal vectors (±1, ±1, 0), (±1, 0, ±1), and (0, ±1, ±1) may be used as the additional vectors.
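The three families of normal vectors described in the embodiments above can be enumerated together, as the following Python sketch suggests. It lists every nonzero vector with components in {-1, 0, 1} and groups the vectors by their number of nonzero components. The vectors are left unnormalized, and the grouping function `normals_by_kind` is a hypothetical name.

```python
from itertools import product

def normals_by_kind():
    """Group the 26 nonzero vectors in {-1,0,1}^3 by count of nonzero components."""
    groups = {1: [], 2: [], 3: []}
    for v in product((-1, 0, 1), repeat=3):
        nonzero = sum(1 for a in v if a != 0)
        if nonzero:                       # skip the zero vector
            groups[nonzero].append(v)
    return groups

groups = normals_by_kind()
face_normals = groups[1]     # six axial directions (first embodiment)
edge_normals = groups[2]     # twelve edge-shaving directions (fourth embodiment)
corner_normals = groups[3]   # eight cube-corner directions (second embodiment)
```

The fourteen-normal set of the third embodiment is `face_normals + corner_normals`, and the twenty-six-normal set is the union of all three groups.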

[0086]
In the examples given above, hulls are recursively grouped in pairs to generate a binary tree. However, in other embodiments, hulls are grouped together in groups of size n_{G}, where n_{G }is larger than two. In one embodiment, the group size may vary from group to group.

[0087]
Although the above discussion has focused on the use of polytope hulls as bounds for objects and clusters, it is noted that any type of bounding surface may be used, thereby generating a hierarchy of bounds referred to herein as a bounding hierarchy. Each node of the bounding hierarchy corresponds to an object or cluster and stores parameters which characterize the corresponding bound for that object or cluster. For example, polynomial surfaces such as quadratic surfaces may be used to generate bounds for objects and/or clusters. Spheres and ellipsoids are examples of quadratic surfaces.

[0088]
Cones in Visible Object Determination

[0089]
In addition to the bounding hierarchy (e.g. hull hierarchy) discussed above, the visualization software makes use of a hierarchy of spatial cones. An initial cone which may represent the view frustum may be recursively subdivided into a hierarchy of subcones. Then a simultaneous double recursion may be performed through the pair of trees (the object tree and cone tree) to rapidly determine the set of visible objects. This conebased method provides a substantial computational gain over the prior art method based on raycasting.

[0090]
FIG. 6 illustrates a two-dimensional cone C in a two-dimensional environment. Cone C is defined by the region interior to the rays R1 and R2 (and inclusive of those rays). Cone C is partitioned into two subcones denoted C0 and C1. The ambient space is populated with a collection of two-dimensional objects OBJ1 through OBJ8. Each of the two-dimensional objects is bounded by a corresponding rectangular hull. Six of the objects are visible with respect to cone C, i.e. objects OBJ1, OBJ2, OBJ3, OBJ4, OBJ7 and OBJ8. Objects OBJ1, OBJ2 and OBJ4 are visible with respect to subcone C1, and objects OBJ3, OBJ7 and OBJ8 are visible with respect to subcone C2. Object OBJ5 is occluded by objects OBJ2 and OBJ4, and object OBJ6 is occluded by objects OBJ3 and OBJ7.

[0091]
Polyhedral Cones

[0092]
The spatial cones used in the preferred embodiment are polyhedral cones. The generic polyhedral cone has a polygonal crosssection. FIG. 7 gives two examples of polyhedral cones. The first polyhedral cone PHC1 has a rectangular crosssection, while the second polyhedral cone PHC2 has a triangular crosssection. The view frustum is a cone with rectangular crosssection like cone PHC1. Polyhedral cones may be defined by homogeneous linear inequalities. Given a normal vector n, the equation n•x=0 involving vector argument x defines a plane passing through the origin and perpendicular to the normal vector n. This plane divides space into two halfspaces. The linear inequality n•x<0 defines the halfspace from which the normal vector n points outward. FIG. 8A gives a twodimensional example. As shown, the equation n•x=0 specifies the set of points (interpreted as vectors) which are perpendicular to normal n. This perpendicular line L divides the plane into two halfplanes. The halfplane defined by the inequality n•x<0 is denoted by shading. Observe that the normal vector n points out of this halfplane.

[0093]
A polyhedral cone is constructed by intersection of multiple half-spaces. For example, solid cone PHC2 of FIG. 7 is the intersection of three half-spaces. Similarly, solid cone PHC1 is the intersection of four half-spaces. FIG. 8B provides a two-dimensional example of intersecting half-planes to generate a conic area. The two normal vectors n_{1} and n_{2} define perpendicular lines L_{1} and L_{2} respectively. The inequality n_{1}•x<0 specifies the half-plane which is southwest (i.e. left and below) of the line L_{1}. The inequality n_{2}•x<0 defines the half-plane which is to the right of line L_{2}. The solution to the simultaneous system of inequalities n_{1}•x<0 and n_{2}•x<0 is the intersection region denoted in shading. This system of inequalities may be summarized by the matrix inequality Sx≤0, where the rows of matrix S are the normal vectors. From this discussion, it may be observed that solid cone PHC1 of FIG. 7 is determined by four normal vectors. The normal matrix S would then have four rows (for the four normal vectors) and three columns corresponding to the dimension of the ambient space.

[0094]
Thus, a polyhedral cone emanating from the origin is defined as the set of points satisfying a system of linear inequalities Sx≤0. [There is no loss of generality in assuming the origin to be the viewpoint.] According to this definition, half-spaces, planes, rays, and the origin itself may be considered as polyhedral cones. In addition, the entire space may be considered to be a polyhedral cone, i.e. that cone which is defined by an empty matrix S.
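For illustration, membership in a polyhedral cone reduces to checking each row of S against the point. The Python sketch below uses a hypothetical four-row matrix describing a cone of square cross-section about the positive z axis, and shows that an empty S accepts every point, consistent with the definition above. The name `in_cone` and the example matrix are assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def in_cone(S, x):
    """True if x satisfies n . x <= 0 for every row n of S."""
    return all(dot(n, x) <= 0 for n in S)

# Square cross-section cone: |x| <= z and |y| <= z (vertex at the origin).
S_square = [(1, 0, -1), (-1, 0, -1), (0, 1, -1), (0, -1, -1)]

inside = in_cone(S_square, (0.2, 0.1, 1.0))
outside = in_cone(S_square, (2.0, 0.0, 1.0))
behind = in_cone(S_square, (0.0, 0.0, -1.0))
whole_space = in_cone([], (5.0, -3.0, 2.0))   # empty S: the entire space
```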

[0095]
Distance Measurement

[0096]
The distance of an object, hull, or bound from a particular viewpoint is defined to be the minimum distance to the object, hull, or bound from the viewpoint. So, assuming a viewpoint at the origin, the distance of the object, hull, or bound X from the viewpoint is defined as
$f\left(X\right)=\underset{x\in X}{\mathrm{min}}\left\|x\right\|,$

[0097]
where ∥x∥ is the norm of vector x.

[0098]
Any vector norm may be chosen for the measurement of distance. In one embodiment, the Euclidean norm is chosen for distance measurements. The Euclidean norm results in a spherically shaped wavefront. More generally, a distance measurement f(X) may be based on any wavefront as long as the wavefront shape satisfies a mild “star-shape” criterion, i.e. the entire boundary of the wavefront is unobstructed when viewed from the origin. All convex wavefronts satisfy this condition, and many non-convex ones do as well. In general, the level curves of a norm are recommended as the wavefront shapes. From a computational standpoint, the spherical wavefront shape given by the L^{2} norm, and the piecewise-linear wavefront shapes given by the L^{1} and L^{∞} norms, provide good choices for visibility detection. It is noted that a piecewise-linear approximation of such a norm may be used instead of the norm itself.

[0099]
Cones and Visibility

[0100]
Consider an arbitrary cone K emanating from the origin as a viewpoint. Define the distance of an object, hull, or bound X relative to the cone K as
${f}_{K}\left(X\right)=\underset{x\in X\cap K}{\mathrm{min}}\left\|x\right\|,$

[0101]
where the symbol ∩ denotes set intersection. If the distance f_{K} is computed for each object X in a scene, the nearest object, i.e. the object which achieves a minimum distance value f_{K}, is at least partially visible with respect to cone K.
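The cone-restricted distance can be illustrated with a deliberately simplified Python sketch in which each object is represented by a finite set of sample points rather than a solid. This point-sampling is an assumption of the sketch and only approximates the true minimum over the object; the Euclidean norm is used, and the names `cone_distance` and `objects` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def in_cone(S, x):
    return all(dot(n, x) <= 0 for n in S)

def cone_distance(S, points):
    """Minimum Euclidean norm over the sample points lying inside the cone.

    Returns infinity when no sample point lies in the cone.
    """
    return min((math.sqrt(dot(p, p)) for p in points if in_cone(S, p)),
               default=math.inf)

# Cone of square cross-section about the +z axis (|x| <= z, |y| <= z).
S = [(1, 0, -1), (-1, 0, -1), (0, 1, -1), (0, -1, -1)]

objects = {
    "near": [(0.0, 0.0, 2.0)],
    "far": [(0.0, 0.0, 5.0), (0.5, 0.0, 4.0)],
    "outside": [(3.0, 0.0, 1.0)],
}
distances = {name: cone_distance(S, pts) for name, pts in objects.items()}
nearest = min(distances, key=distances.get)   # at least partially visible
```

The object named by `nearest` is the one the cone would report as visible; an object that does not meet the cone at all receives an infinite distance and can never be selected while another object intersects the cone.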

[0102]
As discussed above, the ray-based methods of the prior art are able to detect objects only up to the resolution of the ray sample. Small visible objects or small portions of larger objects may be missed entirely due to insufficient ray density. In contrast, cones can completely fill space. Thus, the cone-based method disclosed herein may advantageously detect small visible objects or portions of objects that would be missed by a ray-based method with equal angular resolution.

[0103]
Generalized Separation Measurement

[0104]
For the purposes of performing a visibility search procedure, it is necessary to have a method for measuring the extent of separation (or conversely proximity) of objects, bounds, or hulls with respect to cones. There exists a great variety of such methods in addition to those based on minimizing vector norms defined above.

[0105]
In some embodiments, the separation between a set X and a cone K may be computed based on the model of wavefront propagation. A wavefront propagating internal to the cone from the vertex of the cone has a radius of first interaction with the set X. This radius of first interaction may provide a measurement value of the separation between the set X and the cone K. The wavefront may satisfy a mild “star shape” condition, i.e. the entire boundary of the wavefront is visible from the vertex of the cone.

[0106]
In one embodiment, the measurement value is obtained by computing a penalty of separation between the set X and the cone K. The penalty of separation may be evaluated by minimizing an increasing function of separation distance between the vertex of the cone K and points in the intersection of the cone K and set X. For example, any positive power of a vector norm gives such an increasing function.

[0107]
In another embodiment, the measurement value is obtained by computing a merit of proximity between the set X and the cone K. The merit of proximity may be evaluated by maximizing a decreasing function of separation distance between the vertex of the cone K and points in the intersection of the cone K and set X. For example, any negative power of a vector norm gives such a decreasing function of separation.

[0108]
A Cone Hierarchy

[0109]
In the preferred embodiment, visible objects are determined by operating on a hierarchy of cones in addition to the hierarchy of hulls described above. The class of polyhedral cones is especially well suited for generating a cone hierarchy. Polyhedral cones naturally decompose into polyhedral subcones by the insertion of one or more separating planes. The ability to nest cones into a hierarchical structure may allow a rapid examination of object visibility. As an example, consider two neighboring cones that share a common face. By taking the union of these two cones, a new composite cone is generated. The composite cone neatly contains its children, and is thus capable of being used in querying exactly the same space as its two children. In other words, the children cones share no interior points with each other and they completely fill the parent without leaving any empty space.

[0110]
A typical display and its associated view frustum has a rectangular crosssection. Various possibilities are contemplated for tessellating this rectangular crosssection to generate a system of subcones. For example, the rectangle naturally decomposes into four rectangular crosssections, or two triangular crosssections. Although these examples illustrate decompositions using regular components, irregular components may be used as well.

[0111]
FIGS. 9A-9C illustrate a hierarchical decomposition of an initial view frustum C. FIG. 9A depicts the rectangular cross-section of the view frustum and its bisection into two cones with triangular cross-section, i.e. cones C0 and C1. The view frustum C corresponds to the root node of a cone tree. Cones and their corresponding nodes in the cone tree are identically labeled for simplicity. Each node of the cone tree stores the matrix S of normal vectors which generate the corresponding cone. The root node points to two children nodes corresponding to cones C0 and C1. FIG. 9B illustrates a second decomposition stage. Each of the cones C0 and C1 is bisected into two subcones (again with triangular cross-section). Cone C0 decomposes into the two subcones C00 and C01. Likewise, cone C1 is bisected into two subcones C10 and C11. Nodes are added to the cone tree to reflect the structure of this decomposition. The parent-child relation of nodes in the cone tree reflects the superset-subset relation of the respective cones in space. FIG. 9C illustrates the pattern of successive cone bisections according to one embodiment. Each cone in the hierarchy may be decomposed into two subcones by means of a bisecting plane. FIG. 9C illustrates several successive descending bisections which generate cones C0, C10, C110, and C1110, and so on. The initial cone C (i.e. the view frustum) may be decomposed to any desired resolution. In one embodiment, the bisections terminate when the resultant cones intercept some fraction of a pixel such as, for example, ½ a pixel. The corresponding terminal nodes of the cone tree are called leaves. Alternate embodiments are contemplated where the bisections terminate when the resultant leaf-cones intercept areas which subtend (a) a portion of a pixel such as 1/N where N is a positive integer, or (b) areas including one or more pixels.
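The recursive bisection may be sketched in Python as follows. The sketch makes several assumptions not stated in the text: a triangular cone is represented by the three points where its edge rays meet the plane z=1, each bisecting plane passes through the midpoint of the longest side of the triangular cross-section, and the recursion stops at a fixed depth rather than at a pixel-based criterion. All function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def cone_normals(a, b, c):
    """Rows of S for the triangular cone whose edge rays pass through a, b, c."""
    rows = []
    for u, v, w in ((a, b, c), (b, c, a), (c, a, b)):
        n = cross(u, v)              # normal of the face spanned by rays u and v
        if dot(n, w) > 0:            # orient so the cone interior has n . x <= 0
            n = tuple(-k for k in n)
        rows.append(n)
    return rows

def bisect(tri):
    """Split a triangular cross-section at the midpoint of its longest side."""
    a, b, c = tri
    u, v, w = max(((a, b, c), (b, c, a), (c, a, b)),
                  key=lambda e: sum((p - q) ** 2 for p, q in zip(e[0], e[1])))
    m = tuple((p + q) / 2 for p, q in zip(u, v))
    return (u, m, w), (m, v, w)

def build(tri, label, depth):
    node = {"label": label, "S": cone_normals(*tri), "children": []}
    if depth > 0:
        node["children"] = [build(t, label + str(i), depth - 1)
                            for i, t in enumerate(bisect(tri))]
    return node

def leaves(node):
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]

def in_cone(S, x):
    return all(dot(n, x) <= 0 for n in S)

# View frustum C with rectangular cross-section on the plane z = 1.
A, B, C, D = (-1, -1, 1), (1, -1, 1), (1, 1, 1), (-1, 1, 1)
root = {"label": "C",
        "S": [(1, 0, -1), (-1, 0, -1), (0, 1, -1), (0, -1, -1)],
        "children": [build((A, B, C), "C0", 2), build((A, C, D), "C1", 2)]}
leaf_cones = leaves(root)
hits = [leaf["label"] for leaf in leaf_cones
        if in_cone(leaf["S"], (0.1, 0.2, 1.0))]
```

Each node stores its own matrix S, as described above, and a direction interior to the frustum falls in exactly one leaf cone unless it lies on a bisecting plane.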

[0112]
The triangular hierarchical decomposition shown in FIGS. 9A-9C has a number of useful properties. By decomposing the original rectangular cone based on recursive bisection, a binary tree of cones of arbitrary depth is generated. Triangular cones have the fewest sides, making them computationally more attractive. In addition, triangular cones can also tessellate the entire space surrounding the viewpoint. Imagine a unit cube with viewpoint at the center. The root cone may be the entire space. The root cone may have six subcones which intercept the six corresponding faces of the cube. Thus, it is possible to create a hierarchical cone representation for the entire space surrounding the viewpoint.

[0113]
It is noted that any cone decomposition strategy may be employed to generate a cone hierarchy. In a second embodiment, the view frustum is decomposed into four similar rectangular cones; each of these subcones is decomposed into four more rectangular subcones, and so on. This results in a cone tree with fourfold branches.

[0114]
As used herein, a cone K is said to be a descendent of cone C when cone C contains cone K. Thus, all the cones beneath cone C in the cone hierarchy are said to be descendents of cone C.

[0115]
Discovering the Set of Visible Objects

[0116]
Once the hull hierarchy and the cone hierarchy have been constructed, the set of visible objects may be computed with respect to the current viewpoint. In one embodiment, the visible object set is repeatedly recomputed for a succession of viewpoints, viewing directions, video frames, etc. The successive viewpoints and/or viewing directions may be specified by a user through an input device such as a mouse, joystick, keyboard, trackball, headposition sensor, eyeorientation sensor, or any combination thereof. A visible object determination method may be organized as a simultaneous search of the hull tree and the cone tree. The search process may involve recursively performing hullcone queries. Given a cone node K and a hull node H, a hullcone query on cone K and hull H investigates the visibility of hull H and its descendent hulls with respect to cone K and its descendent cones. The search process has a computational complexity of order log M, where M equals the number of cone nodes times the number of hull nodes. In addition, many hullcone queries can occur in parallel allowing aggressive use of multiple processors in constructing the visibleobjectset.

[0117]
Viewing the Scene

[0118]
The set of visible objects from the current viewpoint may be rendered on one or more displays. Display rendering and visible object determination may be performed independently and concurrently. The display rendering and visible object determination may occur concurrently because the visibleobjectset remains fairly constant between frames in a walkthrough environment. Thus, the previous set of visible objects provides an excellent approximation to the current set of visible objects.

[0119]
Managing the Visible-Object-Set

[0120]
The visualization software executing on graphical computing system 80 may manage the visibleobjectset. Over time, as an enduser navigates through a model, simply inserting objects into the visible object set would result in a visible object set that contains too many objects. Thus, the visualization process may remove objects from the visible object set when those objects no longer belong to the set—or soon thereafter. A variety of solutions to object removal are possible. One solution is based on object aging. The system removes any object from the visible object set that has not been rediscovered by the cone query within a specified number of redraw cycles.
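Object aging may be sketched as follows. The class `VisibleObjectSet`, the parameter `max_age`, and the choice to count age in completed redraw cycles are assumptions made for this Python illustration; the text above specifies only that an object is removed once it has not been rediscovered within a specified number of redraw cycles.

```python
class VisibleObjectSet:
    """Keeps objects until they go unseen for max_age consecutive redraw cycles."""

    def __init__(self, max_age):
        self.max_age = max_age
        self.cycle = 0
        self.last_seen = {}          # object -> cycle in which it was last discovered

    def end_cycle(self, discovered):
        """Record this cycle's discoveries, then drop objects that have aged out."""
        self.cycle += 1
        for obj in discovered:
            self.last_seen[obj] = self.cycle
        stale = [obj for obj, seen in self.last_seen.items()
                 if self.cycle - seen >= self.max_age]
        for obj in stale:
            del self.last_seen[obj]

    def objects(self):
        return set(self.last_seen)

vos = VisibleObjectSet(max_age=2)
vos.end_cycle({"a", "b"})
vos.end_cycle({"a"})
after_two = vos.objects()      # "b" has been unseen for one cycle only
vos.end_cycle({"a"})
after_three = vos.objects()    # "b" has now aged out
```

A larger `max_age` keeps briefly occluded objects available for rendering at the cost of a larger set.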

[0121]
Computing Visibility Using Cones

[0122]
Substantial computational leverage may be provided by recursively searching the hierarchical tree of cones in conjunction with the hierarchical tree of hulls. Whole groups of cones may be tested against whole groups of hulls in a single query. For example, if a parent cone does not intersect a parent hull, it is obvious that no child of the parent cone can intersect any child of the parent hull. In such a situation, the parent hull and all of its descendants may be removed from further visibility considerations with respect to the parent cone.

[0123]
Visibility Search Algorithm

[0124]
In one embodiment, processors PR_{1 }through PR_{M }may implement a recursive search of the two trees (the object tree and the cone tree) to assign visible objects to leaf cones of the cone tree. In an alternate embodiment, the recursive search of the two trees may be performed by processors configured within graphics accelerator 112.

[0125]
The recursive search of the two trees provides a number of opportunities for aggressive pruning of the search space. Central to the search is the objectcone distance measure defined above, i.e. given a cone K and an object (or hull) X, the objectcone distance is defined as
${f}_{K}\left(X\right)=\underset{x\in X\cap K}{\mathrm{min}}\left\|x\right\|.$

[0126]
It is noted that this minimization is in general a nonlinear programming problem since the cones and object hulls are defined by constraint equations, e.g. planes in threespace. If the vector norm ∥x∥ is the L^{1 }norm (i.e. the norm defined as the sum of absolute values of the components of vector x), the nonlinear programming problem reduces to a linear programming problem. If the vector norm ∥x∥ is the Euclidean norm, the nonlinear programming problem reduces to a quadratic programming problem. Given a collection of objects, the object X which achieves the smallest distance f_{K}(X) with respect to cone K is closest to the cone's viewpoint, and therefore is at least partially visible.
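To make the two reductions concrete, suppose the hull X is given by Nx≤c and the cone K by Sx≤0 in three-space. Under the L^{1} norm, one standard reformulation introduces auxiliary variables t_{1}, t_{2}, t_{3} (an assumption of this illustration, not notation from the text) that bound the absolute values of the components of x, giving a linear program; under the Euclidean norm, minimizing the squared norm gives a quadratic program with the same constraints and the same minimizer.

```latex
% L^1 norm: linear program (at the optimum t_i = |x_i|, so the objective equals ||x||_1)
\begin{aligned}
\text{minimize}\quad   & t_1 + t_2 + t_3 \\
\text{subject to}\quad & -t_i \le x_i \le t_i, \qquad i = 1, 2, 3, \\
                       & N x \le c, \\
                       & S x \le 0 .
\end{aligned}

% Euclidean norm: quadratic program (f_K(X) is the square root of the optimal value)
\begin{aligned}
\text{minimize}\quad   & x^{T} x \\
\text{subject to}\quad & N x \le c, \\
                       & S x \le 0 .
\end{aligned}
```

If the constraints are infeasible, the hull does not meet the cone, and the distance may be treated as positive infinity.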

[0127]
The recursive search may explore hullcone pairs starting with the pair defined by the root hull of the hull tree and the root cone of the cone tree (see FIGS. 4 and 9). The recursive search mechanism is built upon several basic elements. A distance measurement function computes the distance f_{K}(X) of a hull X from the viewpoint of a cone K as seen from within cone K. In other words, the distance measurement function determines the conerestricted distance to the hull X. In some embodiments, the minimization associated with evaluating the distance measurement function is implemented by solving an associated linear (or nonlinear) programming problem.

[0128]
To facilitate the search process, each leafcone, i.e. each terminal node of the cone tree, is assigned an extent value which represents its distance to the closest known objecthull as seen within the cone. [An objecthull is a hull that directly bounds an object. Objecthulls are terminal nodes of the hull tree.] Thus, this extent value may be referred to as the visibility distance. The visibility distance of a leafcone is nonincreasing, i.e. it decreases as closer objects (i.e. object hulls) are discovered in the search process. Visibility distances for all leafcones may be initialized to positive infinity. In addition to a visibility distance value, each leafcone node is assigned storage for a pointer which points to a currently visible object. This object pointer may be initialized with a reserved value denoted NO_OBJECT which implies that no object is yet associated with the leafcone. In another embodiment, the object pointer may be initialized with a reserved value denoted BACKGROUND which implies that a default scene background is associated with the leafcone.

[0129]
In addition, each non-leaf cone, i.e. each cone at a non-final refinement level, may be assigned an extent value which equals the maximum of the extent values of its subcones. Or equivalently, the extent value for a non-leaf cone may be set equal to the maximum of the visibility distance values of its leaf-cone descendents. These extent values are also referred to as visibility distance values. The visibility distance values for all non-leaf cones are also initialized to positive infinity (consistent with the initialization of the leaf-cones). Suppose a given non-leaf cone K and a hull H achieve a hull-cone distance f_{K}(H). If this distance f_{K}(H) is greater than the visibility distance value of the cone K, then all of the leaf-cone descendents of cone K already have known objects closer than the hull H. Therefore, little or no benefit may be gained by searching hull H against cone K and its descendents. In contrast, if a hull H achieves a distance f_{K}(H) from cone K which is less than the visibility distance value of cone K, it is possible that hull H contains objects which will strictly decrease the visibility distance of some leaf-cone descendent of cone K. Thus, the hull H and its descendents may be searched against cone K and its descendents. The computation of the hull-cone distance f_{K}(H) for cone K and hull H may be implemented by a software function Dist(H,K).
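The relationship between leaf and non-leaf visibility distances, and the resulting pruning test, may be sketched in Python as follows. The class `Cone` and the functions `refresh_extent` and `worth_searching` are hypothetical names; recomputing extents by a full recursive pass is an assumption chosen for brevity, not a statement of how an embodiment maintains them.

```python
import math

class Cone:
    def __init__(self, children=()):
        self.children = list(children)
        self.vsd = math.inf          # visibility distance, initialized to +infinity

def refresh_extent(cone):
    """Set each non-leaf extent to the maximum of its subcones' extents."""
    if cone.children:
        cone.vsd = max(refresh_extent(k) for k in cone.children)
    return cone.vsd

def worth_searching(distance, cone):
    """A hull is searched against a cone only if it is closer than the cone's extent."""
    return distance < cone.vsd

leaf_a, leaf_b = Cone(), Cone()
parent = Cone([leaf_a, leaf_b])

leaf_a.vsd = 3.0
before = refresh_extent(parent)   # leaf_b is still at infinity
leaf_b.vsd = 5.0
after = refresh_extent(parent)    # every leaf now has an object within 5.0
```

Once `after` is finite, any hull whose cone-restricted distance is 5.0 or more cannot improve any leaf beneath `parent` and can be skipped.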

[0130]
A global problem queue may be maintained in shared memory 106. The global problem queue may store hull-cone pairs (H,C) which are to be searched. In one embodiment, the global problem queue may initially contain one pair (H_{R},C_{R}) corresponding to the root hull H_{R} of the hull hierarchy and the root cone C_{R} of the cone hierarchy. Alternatively, the global problem queue may be initially loaded with all hull-cone pairs of the form (H_{n,i},C_{m,j}), where hull H_{n,i} is a hull in layer n of the hull hierarchy, and cone C_{m,j} is a cone in layer m of the cone hierarchy, wherein n and m are integers greater than or equal to zero. Layer zero of the hull hierarchy corresponds to the root hull. Layer zero of the cone hierarchy corresponds to the root cone.

[0131]
For example, assuming the cone and hull hierarchies are binary trees, the second level below the root hull contains four hulls, and the third level below the root cone contains eight cones. Thus, the global problem queue may be initially loaded with the 32=4×8 hullcone pairs generated by these four hulls and eight cones.
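The initial loading described in this example can be sketched in Python as follows. The trees here are placeholder complete binary trees whose nodes carry only labels; the names `Node`, `complete_binary_tree`, `layer`, and `problem_queue` are assumptions of the sketch.

```python
from collections import deque
from itertools import product

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def complete_binary_tree(label, depth):
    """Placeholder hierarchy: children are labeled by appending 0 or 1."""
    if depth == 0:
        return Node(label)
    return Node(label, [complete_binary_tree(label + bit, depth - 1) for bit in "01"])

def layer(root, n):
    """Return the nodes n levels below the root (layer zero is the root itself)."""
    nodes = [root]
    for _ in range(n):
        nodes = [child for node in nodes for child in node.children]
    return nodes

hull_root = complete_binary_tree("H", 3)
cone_root = complete_binary_tree("C", 4)

# All pairs (H_{2,i}, C_{3,j}): four hulls times eight cones.
problem_queue = deque(product(layer(hull_root, 2), layer(cone_root, 3)))
```

Loading a deeper layer produces more, smaller problems at the outset, which may help keep several processors busy from the first step.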

[0132]
In addition, the hull hierarchy and the cone hierarchy may be stored in shared memory 106. Each processor PR_{I} may cache portions of the hull hierarchy and cone hierarchy as needed.

[0133]
Each of processors PR_{1} through PR_{M} may participate in a parallel search of the hull hierarchy and cone hierarchy. When a processor PR_{I} becomes available, processor PR_{I} (referred to as “the processor” in the following discussion) may perform computations on a hull-cone pair as indicated in FIG. 10A. In step 260, the processor reads the global problem queue to obtain a hull-cone pair (H,C). The hull-cone pair (H,C) may comprise a pointer to hull H in the hull hierarchy and a pointer to cone C in the cone hierarchy. Furthermore, the processor may compute the hull-cone distance d_{H,C} between hull H and cone C by invoking the function Dist. Alternatively, the code for computing the hull-cone distance may be configured as inline code.

[0134]
In step 262, the processor may determine if the hullcone distance d_{H,C }is less than the visibility distance VSD_{C }of cone C. If the hullcone distance d_{H,C }is not less than the visibility distance VSD_{C }of cone C, the processor may terminate processing as indicated in step 265.

[0135]
If, in step 262, the hullcone distance d_{H,C }is determined to be less than the visibility distance VSD_{C }of cone C, step 266 may be performed.

[0136]
In step 266, the processor may determine if hull H and cone C are both leaves of their respective hierarchies. If hull H and cone C are both leaf nodes, step 268 may be performed.

[0137]
In step 268, the processor may update the visibility data of the leaf cone C. For example, the processor may set the visibility distance VSD_{C }of the leaf cone C equal to the hullcone distance d_{H,C}, and the object pointer OBJ_{C }for leaf cone C equal to the leaf hull H, or a pointer to leaf hull H, or a pointer to an object contained within leaf hull H.

[0138]
If hull H and cone C are not both leaves, the processor may perform step 270. In step 270, the processor may determine if hull H is a leaf of the hull hierarchy. If hull H is a leaf of the hull hierarchy, the processor may perform step 272.

[0139]
In step 272, the processor may write to the global problem queue two hullcone pairs, i.e. pair (H,C_{1}) corresponding to hull H and a first subcone C_{1 }of cone C, and pair (H,C_{2}) corresponding to hull H and a second subcone C_{2 }of cone C. More generally, for cone hierarchies which allow more than two subcones per cone, the processor may write to the global problem queue all pairs of the form (H,C_{I}) where C_{I }is a subcone of cone C. After writing the hullcone pairs to the global problem queue, the processor may terminate processing.

[0140]
If, in step 270, the processor determines that hull H is not a leaf hull, step 274 may be performed. In step 274, the processor may compute a normalized size Size_H for hull H and a normalized size Size_C for cone C, and may compare Size_H and Size_C. A variety of methods are contemplated for computing the hull size and cone size. If Size_H is larger than Size_C, step 276 may be performed.

[0141]
In step 276, the processor may write a hullcone pair (H_{I},C) to the global problem queue for each subhull H_{I }of hull H, i.e. all pairings of cone C with the subhulls of hull H. After writing the hullcone pairs to the global problem queue, the processor may terminate processing, e.g., may enter an idle state.

[0142]
If, in step 274, the processor determines that Size_H is not larger than Size_C, the processor may perform step 278. In step 278, the processor may write a hullcone pair (H,C_{I}) to the global problem queue for each subcone C_{I }of cone C, i.e. all pairings of hull H with the subcones of cone C. After writing the hullcone pairs to the global problem queue, the processor may terminate processing.

[0143]
When the processor terminates processing as indicated in steps 265, 268, 272, 276 and 278, it becomes available for processing another hullcone pair. Thus, the processor may immediately reexecute the program thread described in FIG. 10A, and thereby, access and operate on the next available hullcone pair from the global problem queue.

[0144]
Thus, each of processors PR_{1 }through PR_{M }may cooperate in the effort of searching the cone hierarchy and hull hierarchy. Each execution of the program thread 250 of FIG. 10A by one of processors PR_{1 }through PR_{M }consumes a hullcone pair (H,C) from the global problem queue, and may add to the global problem queue (a) two or more hullcone pairs of the form (H_{I},C) where H_{I }runs through the subhulls of hull H, or (b) two or more hullcone pairs of the form (H,C_{I}) where C_{I }runs through the subcones of cone C. In the case where the hullcone distance d_{H,C }is not less than the visibility distance VSD_{C }of cone C, new hullcone pairs are not added to the global problem queue. When a hullcone pair (H,C) corresponds to a leaf hull and a leaf cone, the visibility distance and object pointer of the leaf cone are updated, and no new hullcone pairs are added to the global problem queue.
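
By way of illustration only, one execution of program thread 250 may be sketched in Python as follows. The classes Hull and Cone, the callable dist (standing in for the function Dist) and the callable split_hull (standing in for the size comparison of step 274) are hypothetical. The sketch is single threaded, so no locking of the shared queue is shown, and it assumes that a nonleaf hull paired with a leaf cone is always refined on the hull side, a case the steps above do not spell out.

```python
import math
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)
class Hull:
    name: str
    children: list = field(default_factory=list)


@dataclass(eq=False)
class Cone:
    name: str
    children: list = field(default_factory=list)
    vsd: float = math.inf   # visibility distance VSD_C
    obj: object = None      # object pointer OBJ_C


def thread_250(queue, dist, split_hull):
    """Process one hullcone pair; return False if the queue was empty."""
    if not queue:
        return False
    hull, cone = queue.popleft()                          # step 260
    d = dist(hull, cone)
    if not d < cone.vsd:                                  # step 262
        return True                                       # step 265
    hull_leaf, cone_leaf = not hull.children, not cone.children
    if hull_leaf and cone_leaf:                           # step 266
        cone.vsd, cone.obj = d, hull                      # step 268
    elif hull_leaf:                                       # step 270
        queue.extend((hull, c) for c in cone.children)    # step 272
    elif cone_leaf or split_hull(hull, cone):             # step 274
        queue.extend((h, cone) for h in hull.children)    # step 276
    else:
        queue.extend((hull, c) for c in cone.children)    # step 278
    return True


# Two leaf hulls at distances 5 and 2, searched against two leaf cones.
a, b = Hull("A"), Hull("B")
c0, c1 = Cone("C0"), Cone("C1")
queue = deque([(Hull("HR", [a, b]), Cone("CR", [c0, c1]))])
distances = {"HR": 1.0, "A": 5.0, "B": 2.0}
while thread_250(queue, lambda h, c: distances[h.name], lambda h, c: True):
    pass
print(c0.obj.name, c0.vsd)  # B 2.0
```

Each call consumes exactly one pair and adds zero or more pairs, so the loop ends when the queue drains.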

[0145]
In one embodiment, each processor PR_{I }executes a local copy of program thread 250 stored in a local memory dedicated to processor PR_{I}. Thus, processor PR_{I }need not compete with other processors to access shared memory 106 for program code.

[0146]
As described above, graphical computing system 80 may use a single global problem queue. All initial problems, i.e. hullcone pairs, may be loaded into the global problem queue.

[0147]
The operation of all processors repeatedly executing program thread 250 as they become available achieves the determination of the set of visible objects. When the global problem queue is empty and all processors have terminated, the set of visible objects will have been identified. The nearest object pointer for each leaf cone of the cone hierarchy defines the nearest object visible with respect to the leaf cone, i.e. as seen within the leaf cone. Because neighboring leaf cones may share the same nearest objects, the object pointers associated with the leaf cones of the cone hierarchy may be processed to remove any redundancies before transmission to the rendering agent.
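
By way of illustration only, the removal of redundancies among the object pointers may be sketched in Python as follows. The dictionary key "obj" is a hypothetical stand-in for the object pointer OBJ_{C }of a leaf cone, and objects are compared by identity.

```python
def visible_object_set(leaf_cones):
    """Collect each leaf cone's nearest object once, in first-seen order."""
    seen = set()
    visible = []
    for cone in leaf_cones:
        obj = cone["obj"]                 # hypothetical field holding OBJ_C
        if obj is not None and id(obj) not in seen:
            seen.add(id(obj))
            visible.append(obj)
    return visible


teapot, chair = object(), object()
leaf_cones = [{"obj": teapot}, {"obj": teapot}, {"obj": None}, {"obj": chair}]
visible = visible_object_set(leaf_cones)
print(len(visible))  # 2
```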

[0148]
Local Problem Queue Per Processor

[0149]
In some embodiments, graphical computing system 80 includes a main problem queue and a set of local problem queues, one local problem queue per processor. The main problem queue may reside in shared memory 106 and may store the initial hullcone pairs. The local memory of each processor stores one of the local problem queues.

[0150]
Each processor PR_{I }may initially reside in an idle state 305 as shown in FIG. 10B. If the main problem queue is nonempty, processor PR_{I }may transition to state 310. In state 310, processor PR_{I }may read an initial hullcone pair from the main problem queue, and operate on the initial hullcone pair as described in connection with program thread 250 above with the exception that any hullcone pairs generated in response to the initial hullcone pair (by program thread 250) are written to the local problem queue of processor PR_{I }instead of the main problem queue. After executing the program thread 250 on the initial hullcone pair, processor PR_{I }may transition to state 315.

[0151]
In state 315, processor PR_{I }may read a hullcone pair from its local problem queue, and operate on the hullcone pair as described above in connection with program thread 250, again with the exception that any hullcone pairs generated in response to the received hullcone pair may be stored in the local problem queue. Processor PR_{I }may repeatedly visit state 315 as long as the local problem queue is nonempty. When the local problem queue of processor PR_{I }becomes empty, processor PR_{I }may return to the idle state 305.

[0152]
Because each processor operates from its local problem queue and accesses the main problem queue only when its local problem queue is empty, memory access conflicts to the main problem queue may be minimized.

[0153]
In one embodiment, if the main problem queue is empty and the local problem queue of a processor PR_{I }is empty, processor PR_{I }may read a hullcone pair from the local problem queue of another processor PR_{J}, thus balancing the load between processors.

[0154]
After the main problem queue and all local problem queues are empty, the set of visible objects will have been determined.
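
By way of illustration only, the interplay of the main problem queue, the local problem queues and the optional load balancing may be simulated in Python as follows. The function run_processors and the callable process_pair (which returns the pairs generated from a given pair) are hypothetical. The sketch steps the processors in turn rather than in parallel, and it assumes that an idle processor takes a pair from the far end of the longest local queue, a policy the description leaves open.

```python
from collections import deque


def run_processors(main_queue, num_processors, process_pair):
    """Simulate the idle/310/315 cycle, one pair per processor per round."""
    local = [deque() for _ in range(num_processors)]
    handled = [0] * num_processors
    while main_queue or any(local):
        for i in range(num_processors):
            if local[i]:                        # state 315: own queue first
                pair = local[i].popleft()
            elif main_queue:                    # state 310: take an initial pair
                pair = main_queue.popleft()
            else:                               # optional load balancing
                donors = [q for q in local if q]
                if not donors:
                    continue                    # idle state 305
                pair = max(donors, key=len).pop()
            local[i].extend(process_pair(pair))  # generated pairs stay local
            handled[i] += 1
    return handled


def expand(depth):
    """Toy problem: a pair at depth n generates two pairs at depth n - 1."""
    return [depth - 1, depth - 1] if depth > 0 else []


handled = run_processors(deque([3]), 2, expand)
print(sum(handled))  # 15
```

In the toy run a single initial problem expands into fifteen problems in total, and the second processor obtains all of its work from the first processor's local queue.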

[0155]
Distributing Cones to Processors

[0156]
In some embodiments, a collection of cones from the cone hierarchy may be distributed to processors PR_{1 }through PR_{M}. The cones comprising the collection preferably are nonoverlapping and fill up the space of the root cone. For example, the cone collection may be the complete set of cones at some K^{th }level below the root cone, where K is a positive integer. Alternatively, the cone collection may comprise cones from various levels of the cone hierarchy.

[0157]
Each cone from the cone collection is assigned to one and only one of the processors PR_{1 }through PR_{M}. In other words, in this embodiment, a cone from the cone collection is not assigned to multiple processors. For each cone C assigned to processor PR_{I}, processor PR_{I }(or some external agent) may load an initial hullcone pair (H_{R},C) corresponding to the root hull and cone C into its local problem queue.
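
By way of illustration only, the assignment of cones to processors may be sketched in Python as follows. The function distribute_cones is hypothetical, and the round robin assignment is an assumption; any assignment that gives each cone to exactly one processor would serve.

```python
from collections import deque


def distribute_cones(cone_collection, num_processors, root_hull):
    """Give each cone to exactly one processor, seeded as (root_hull, cone)."""
    local_queues = [deque() for _ in range(num_processors)]
    for index, cone in enumerate(cone_collection):
        local_queues[index % num_processors].append((root_hull, cone))
    return local_queues


queues = distribute_cones([f"C{k}" for k in range(8)], 3, "H_R")
print([len(q) for q in queues])  # [3, 3, 2]
```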

[0158]
Processor PR_{I }may operate as described in FIG. 10C. Processor PR_{I }may initially reside in an idle state 320. In response to the local queue being nonempty, processor PR_{I }may transition to state 330.

[0159]
In state 330, processor PR_{I }may read a hullcone pair, e.g. one of the initial hullcone pairs (H_{R},C), from the local problem queue, and operate on the hullcone pair as described in connection with program thread 250 above with the exception that any responsively generated hullcone pairs are written to the local problem queue. After executing the program thread 250 on a given hullcone pair, processor PR_{I }may return to state 330 to read and operate on another hullcone pair from the local problem queue. Processor PR_{I }may revisit state 330 as long as the local problem queue remains nonempty. When the local problem queue is empty, processor PR_{I }may return to idle state 320.

[0160]
Size Conditioned Tree Search

[0161]
As described above in connection with step 274, processor PR_{I }determines a normalized size Size_C for cone C and a normalized size Size_H for hull H. In one embodiment, Size_H may be computed by dividing a solid diameter (or the square of a solid diameter) of hull H by the distance d_{H,C }of hull H with respect to cone C. Size_C may be determined by computing the solid angle subtended by cone C. Size_C may also be determined by computing the cone's cross sectional area at some convenient distance (e.g. distance one) from the viewpoint. The cross section may be normal to an axis of the cone C. The cone size Size_C for each cone in the cone hierarchy may be computed when the cone hierarchy is generated (e.g. at system initialization time).

[0162]
If the hull size Size_H is larger than the cone size Size_C as suggested by FIG. 10D, on average, the probability of at least one subhull of hull H having an empty intersection with cone C is larger than the probability of at least one subcone of cone C having an empty intersection with hull H. Thus, in this case, it may be more advantageous to explore the subhulls H0 and H1 of hull H with respect to cone C, rather than exploring the subcones of cone C with respect to hull H. For example, FIG. 10D illustrates an empty intersection between subhull H0 and cone C. This implies that none of the descendents of subhull H0 need to be searched against any of the descendents of cone C.

[0163]
If the hull size Size_H is smaller than the cone size Size_C as suggested by FIG. 10E, on average, the probability of at least one subhull of hull H having an empty intersection with cone C is smaller than the probability of at least one subcone of cone C having an empty intersection with hull H. Thus, in this case, it may be more advantageous to explore the subcones C0 and C1 of cone C with respect to hull H, rather than exploring the subhulls of hull H with respect to cone C. For example, FIG. 10E illustrates an empty intersection between subcone C0 and hull H. This implies that none of the descendents of subcone C0 need to be searched against any of the descendents of hull H.

[0164]
By selecting the larger entity (hull or cone) for refinement, the program thread 250 may more effectively prune the combined hullcone tree, and determine the set of visible objects with increased efficiency.
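
By way of illustration only, the size computation and the selection of the larger entity may be sketched in Python as follows. The functions hull_size, cone_size and choose_refinement are hypothetical. The sketch assumes a circular cone described by its half angle, treats a hull at zero distance as infinitely large, and refines the cone when the two sizes are equal; none of these choices is dictated by the description.

```python
import math


def hull_size(solid_diameter, distance):
    """Size_H: solid diameter of the hull divided by its hullcone distance."""
    if distance <= 0.0:
        return math.inf          # assumed: hull reaches the viewpoint
    return solid_diameter / distance


def cone_size(half_angle):
    """Size_C: solid angle of a circular cone, in steradians."""
    return 2.0 * math.pi * (1.0 - math.cos(half_angle))


def choose_refinement(size_h, size_c):
    """Return 'hull' to explore subhulls or 'cone' to explore subcones."""
    return "hull" if size_h > size_c else "cone"


# A nearby hull seen through a narrow cone: refine the hull.
print(choose_refinement(hull_size(2.0, 4.0), cone_size(0.1)))    # hull
# A small distant hull seen through a wide cone: refine the cone.
print(choose_refinement(hull_size(0.01, 10.0), cone_size(0.5)))  # cone
```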

[0165]
Method for Constructing a Bounding Hierarchy

[0166]
FIG. 11 illustrates the construction of a bounding hierarchy (i.e. a bounding tree structure) from a collection of objects. The collection of objects may be accessed from memory 106. In step 602, the objects in the graphics scene may be recursively clustered. Objects may be assembled into clusters preferably based on proximity. These first order clusters are themselves assembled into second order clusters. Clusters of successively higher order are formed until all the objects are contained in one universal cluster. Objects may be considered as order zero clusters. In step 604, each cluster of all orders is bounded with a corresponding bound. The bounds are preferably polytope hulls as described above in connection with FIGS. 4 and 5. However, other types of bounds are contemplated such as, e.g., quadratic surfaces, generalized polynomial bounds, etc.

[0167]
In step 606, a hierarchical tree of bounds is generated by allocating a node for each of the objects and clusters. In step 608, each node is assigned parameters which describe (characterize) the corresponding bound. In one embodiment this parameter assignment comprises storing the extent vector c which locates the polytope hull faces as described in connection with FIGS. 5A and 5B. In step 610, the nodes are organized so that node relationships correspond to cluster membership. For example, if node A is the parent of node B in the bounding hierarchy, then the cluster corresponding to node A contains a subcluster corresponding to node B, and the bound for node A contains the bound for node B.
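
By way of illustration only, steps 602 through 610 may be sketched in Python as follows. The class BoundNode and the functions merge and build_hierarchy are hypothetical. The sketch assumes axis aligned boxes as the bounds, a nonempty list of objects, and a crude proximity rule that sorts clusters by centroid and pairs neighbours; a practical system may cluster differently.

```python
from dataclasses import dataclass, field


@dataclass
class BoundNode:
    lo: tuple                      # minimum corner of the bound
    hi: tuple                      # maximum corner of the bound
    children: list = field(default_factory=list)
    obj: object = None             # set only on order zero clusters


def merge(a, b):
    """Bound a cluster formed from two lower order clusters."""
    lo = tuple(min(p, q) for p, q in zip(a.lo, b.lo))
    hi = tuple(max(p, q) for p, q in zip(a.hi, b.hi))
    return BoundNode(lo, hi, [a, b])


def build_hierarchy(nodes):
    """Pair neighbouring clusters until one universal cluster remains."""
    while len(nodes) > 1:
        # Sort by (twice the) centroid so adjacent entries are close together.
        nodes = sorted(nodes, key=lambda n: [l + h for l, h in zip(n.lo, n.hi)])
        paired = [merge(nodes[i], nodes[i + 1])
                  for i in range(0, len(nodes) - 1, 2)]
        if len(nodes) % 2:
            paired.append(nodes[-1])   # odd cluster is carried up unchanged
        nodes = paired
    return nodes[0]


objects = [BoundNode((x, 0.0), (x + 1.0, 1.0), obj=f"obj{x}")
           for x in (0.0, 5.0, 1.0, 6.0)]
root = build_hierarchy(objects)
print(root.lo, root.hi)  # (0.0, 0.0) (7.0, 1.0)
```

Because each parent bound is the union of its children's bounds, the bound for a parent node contains the bound for each child node, as required in step 610.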

[0168]
Although the construction of the bounding hierarchy above has been described in terms of recursive clustering, it is noted that alternative embodiments are contemplated which use other forms of clustering such as iterative clustering.

[0169]
Computing the Cone Restricted Distance Function

[0170]
Recall that evaluation of the hullcone distance f_{c}(H) of a hull H from a cone C calls for minimizing ∥x∥ subject to the hull constraints Ax≲b and the cone constraints Sx≲0. The rows of matrix A comprise normals for the hull surfaces. The rows of matrix S comprise normals for the cone surfaces. This minimization may be formulated as a nonlinear programming problem. For example, the nonlinear programming problem reduces to a quadratic programming problem when a Euclidean norm is used, and a linear programming problem when the L^{1 }norm is used. The hullcone distance computation is herein referred to as a geometric query.

[0171]
It is also noted that hullcone separation may be measured by maximizing a decreasing function of separation such as ∥x∥^{−1 }for points x satisfying the bound/hull constraints and the cone constraints. Thus, in general a hullcone separation value may be computed by determining an extremal (i.e. minimal or maximal) value of the separation function subject to the cone constraints and the bound/hull constraints.

[0172]
The use of a hierarchy of cones instead of a collection of rays is motivated by the desire for computational efficiency. Thanks to early candidate pruning that results from the double recursion illustrated earlier, fewer geometric queries are performed. These queries however are more expensive than the queries used in the ray casting method. Therefore, the cone query calculation may be designed meticulously. A sloppy algorithm could end up wasting most of the computational advantage provided by improvements in the dual tree search. For the linear programming case, a method for achieving a computationally tight query will now be outlined.

[0173]
A piecewiselinear formulation of distance f_{c }leads to the following linear program:

min(v ^{T} x)

[0174]
subject to

Ax≲b, Sx≲0.

[0175]
The vector v is some member of the cone that is polar to the cone C. For instance, v=−S^{T}e, where e is the vector of all ones. [It is noted that the rows of the matrix S are outward normals to the cone surfaces. Thus, the negation of the sum of the normal vectors gives a polar vector.] The condition Ax≲b implies that the point x is within the bounding hull. The condition Sx≲0 implies that the point x is within the cone C. For an efficient solution method, the linear program problem is restated in terms of its dual:

max(b ^{T} y)

[0176]
subject to

A ^{T} y+S ^{T} z=v,0≲y,0≲z.

[0177]
The dual objective value, b^{T}y is infinite when the cone and bounding hull do not intersect (the variables y and z are the Lagrange multipliers of the previous problem's constraints).
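
By way of illustration only, the primal linear program stated above may be checked on a small two dimensional case in Python as follows. The function hull_cone_distance_2d is hypothetical. It takes v as the negated sum of the outward cone normals, per the bracketed remark above, and it enumerates candidate vertices by brute force; it is not the simplex based technique discussed below, and it returns infinity when the hull and the cone do not intersect.

```python
import math
from itertools import combinations


def hull_cone_distance_2d(A, b, S, eps=1e-9):
    """Minimize v.x over {x : Ax <= b, Sx <= 0} by enumerating vertices."""
    rows = list(A) + list(S)
    rhs = list(b) + [0.0] * len(S)
    # Negated sum of the outward cone normals, a vector polar to the cone.
    v = (-sum(s[0] for s in S), -sum(s[1] for s in S))
    best = math.inf
    for i, j in combinations(range(len(rows)), 2):
        (a1, a2), (c1, c2) = rows[i], rows[j]
        det = a1 * c2 - a2 * c1
        if abs(det) < eps:
            continue                         # parallel constraint lines
        # Cramer's rule for the intersection of the two constraint lines.
        x = (rhs[i] * c2 - a2 * rhs[j]) / det
        y = (a1 * rhs[j] - rhs[i] * c1) / det
        if all(r[0] * x + r[1] * y <= t + eps for r, t in zip(rows, rhs)):
            best = min(best, v[0] * x + v[1] * y)
    return best                              # inf when hull and cone are disjoint


# Cone |y| <= x, written with outward normals (-1, 1) and (-1, -1).
S = [(-1.0, 1.0), (-1.0, -1.0)]
# Box hull with fixed face normals; only the extent vector b changes.
A = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
print(hull_cone_distance_2d(A, [4.0, -3.0, 1.0, 1.0], S))   # 6.0
print(hull_cone_distance_2d(A, [-3.0, 4.0, 1.0, 1.0], S))   # inf
```

In the first call the box spans x from 3 to 4 inside the cone, so the minimum of v^{T}x = 2x is 6. In the second call the box lies behind the viewpoint and no feasible point exists.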

[0178]
In the preferred embodiment, the bounding hulls have sides normal to a fixed set of normal vectors. Thus, the matrix A^{T }is the same for all hulls. For a given cone, the matrix S^{T }and the vector v are also fixed. From this observation, it is apparent that the multidimensional polyhedron

{(y,z):A ^{T} y+S ^{T} z=v,0≲y,0≲z}

[0179]
is associated with the cone. (In one embodiment, this polyhedron has seventeen dimensions. Fourteen of those dimensions come from the type of the fixeddirection bounding hull and three additional dimensions come from the cone.) Since the polyhedron depends only on the cone matrix S, it is feasible to completely precompute the extremal structure of the polyhedron for each cone in the cone hierarchy. By complementary slackness, the vertices of the polyhedron will have at most three nonzero elements. The edges and extremal rays will have at most four nonzero elements. An abbreviated, simplexbased, hillclimbing technique can be used to quickly solve the query in this setting.

[0180]
In one embodiment, the entire space is tessellated with cones, and visible objects are detected within the entire space. After this entirespace visibility computation, the set of visible objects may be culled to conform to the current view frustum, and the visible objects which survive the frustum culling may be rendered and displayed.

[0181]
In an alternative embodiment, a less aggressive approach may be pursued. In particular, by determining beforehand a collection of the cones in the cone hierarchy which correspond to the view frustum in its current orientation, only this collection may be included in the visibleobjectset computation.

[0182]
Memory Media

[0183]
As described above, the visibility software realized by program thread 250 may be stored in shared memory 106 and/or the local memories of processors PR_{1 }through PR_{M}. In addition, the visibility software may be stored in any desired memory media such as an installation media (e.g. CDROM, floppy disk, etc.), a nonvolatile memory (e.g. hard disk, optical storage, magnetic tape, bubble memory, ROM, etc.), various kinds of volatile memory such as RAM, or any combination thereof. In some embodiments, the visibility software may be deposited on memory media for distribution to end users and/or customers. Also, the visibility software may be transmitted through a transmission medium (e.g. the atmosphere and/or free space, a network of computers, an electrical conductor, optical fiber, etc.) between an information source and destination.

[0184]
In one embodiment, the visibility software may be implemented as part of an operating system. In a second embodiment, the visibility software may be implemented as a dynamic link library. In a third embodiment, the visibility software may be implemented as part of a device driver (e.g. a device driver for graphics accelerator 112).

[0185]
In a fourth embodiment, the visibility software may be implemented as part of a JAVA 3D virtual machine which executes on processors PR_{1 }through PR_{M}. A user may access a remote server through a network. The server responsively generates a stream of graphics data comprising graphical objects. The visibility software executing as part of the JAVA 3D virtual machine may determine a set of visible objects from the graphical objects. The virtual machine may provide the set of visible objects (or pointers to the visible objects) to a rendering agent. The rendering agent may be a hardware rendering unit such as graphics accelerator 112. Alternatively, the rendering agent may be a software renderer.