METHODS AND APPARATUS FOR PARTIAL AND CONSISTENT MONITORING OF OBJECT-ORIENTED PROGRAMS AND SYSTEMS
FIELD OF THE INVENTION
The present invention generally relates to computer programming and, in particular, to monitoring object-oriented programs.
BACKGROUND OF THE INVENTION
Monitoring is the activity in which a program execution environment reports to external listener subsystems what is happening during program execution. Monitoring enables many different tools for performing different tasks. Examples of such tools are profiling tools for performance enhancement, tracing tools for program understanding, and debugger tools for debugging. Monitoring is generally costly because: (i) the amount of information produced is fairly large; (ii) the overhead of monitoring is high; or (iii) a combination of both. Consequently, many systems support partial monitoring, where only a subset of the information is produced. There are many different ways of expressing which subset is of interest. However, no matter what conventional method is used to partially monitor a system, the resulting partial monitoring can generate an inconsistent subset of the complete information. Such an inconsistent subset can lead the various information processing tools to interpret the program execution behavior incorrectly and to reach erroneous conclusions.
As a result, there exists a need for partial monitoring methods and apparatus which, when incorporated into a program execution environment, result in the generation of consistent information regarding program execution without limiting the possible kinds of information filtering criteria. Accordingly, the consistent information so produced could be used by tools such as, for example, program analyzers or program visualizers to correctly interpret the program execution and solve performance, program understanding and correctness problems.
SUMMARY OF THE INVENTION
The present invention provides methods and apparatus for partial monitoring which, when incorporated into a program execution environment, result in the generation of consistent information regarding program execution without limiting the possible kinds of information filtering criteria.
In one aspect of the invention, a method of monitoring events generated by an object-oriented system comprises the steps of: (i) monitoring events which describe executed operations associated with the object-oriented system; and (ii) applying one or more sequencing rules when reporting a subset of the monitored events, the one or more sequencing rules substantially ensuring consistent reporting of the subset of monitored events. Preferably, monitoring continues when event reporting is at least partially disabled. Further, the monitoring step may include dividing the monitored events into categories. One category may include entity events, an entity event defining an existence status (e.g., object creation, object reclamation, etc.) of a given event. Another category may include activity events, an activity event defining an operation associated with a given event. Still further, the entity events and activity events may be
further divided into at least one of an object event category, an execution event category, a type event category and a synchronization event category. The sequencing rules are applied to maintain substantial consistency with respect to information associated with the categories.
In one embodiment, the sequencing rules may specify that: (i) an entity event reporting creation of an entity precede an event reporting reclamation of the entity; (ii) an activity event referring to an entity appear between respective entity events reporting creation and reclamation of the entity; (iii) an entity is not reclaimed without reporting its reclamation; (iv) an invocation is not reported without its associated parent being reported; and (v) synchronization events maintain correct semantics.
Advantageously, the methodologies of the invention may produce substantially consistent information in the context of partial reporting (i.e., reporting a subset of the events being monitored) with respect to a program being executed by the object-oriented system. Selection of the subset of information to be reported may be made using either dynamic or static filtering criteria. The information may be used by tools such as, for example, program analyzers or program visualizers to correctly interpret the program execution and solve performance, program understanding and correctness problems.
It is to be appreciated that the term "partial monitoring" as used in accordance with the invention may be thought of as referring to the perspective of the external listener (e.g., tool). That is, while a subset of the monitored events is reported to the listener in accordance with the one or more sequencing rules, the methodology of the present invention preferably monitors substantially all events associated with the object-oriented system. However, due to the subset reporting, it appears to the external listener that the listener is partially monitoring the object-oriented system. As mentioned, one advantage over the prior art is that the sequencing rules substantially ensure consistent reporting of the subset of monitored events.
These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram illustrating an illustrative program execution and event monitoring system in which the invention may be employed;
FIG. 2 is a visual representation illustrating object-oriented program execution through time;
FIG. 3 is a diagram illustrating potential infinite tree generation;
FIG. 4 is a flow diagram illustrating a report event tree generation methodology according to one embodiment of the invention;
FIG. 5 is a flow diagram illustrating a consistent monitoring methodology according to one embodiment of the invention; and
FIG. 6 is a block diagram illustrating an exemplary computer system for implementing the invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
As mentioned, monitoring is the activity in which a running system reports to external listeners what is happening inside the system. A system usually reports its activity through events. Depending on the object-oriented system that is considered, these events can be different in nature. For example, a program execution environment can report events related to the execution of the program. The events reported can be read accesses to data structures, write accesses to data structures, the beginning of execution of a method, the end of execution of a method, etc. A high-level block diagram illustrating the context of the invention is shown in FIG. 1. As shown, a program 2 is executed within a program execution environment 4. The environment 4 includes a program execution subsystem 6 and an integrated monitoring subsystem (agent) 8. The program execution subsystem 6 feeds the monitoring agent 8 different types of events and event information regarding program execution. The monitoring agent portion of the program execution environment reports events in a form that is readable and understandable by external listener subsystems such as, for example, program event processing tools 10. The event reporting may be controlled by an event reporting controller 12. That is, the controller 12 tells the monitoring agent 8 what to report and whether to report (i.e., reporting on) or not report (i.e., reporting off) information to the processing tool 10. It is to be appreciated that this invention deals with particular methodologies of developing and providing such an integrated monitoring subsystem.
Each system designer decides what monitoring capabilities need to be included in a system. Because monitoring is usually very expensive, many systems provide some way of subsetting the reported events as users see fit. This encompasses many well-known subsets. For instance, it includes start-stop tracing, where the monitoring is stopped and then later resumed. The starts and stops may be user-driven or timer-driven (sampling). Partial monitoring also includes more advanced filtering, where only some events are generated based on filtering criteria, such as events related to a particular event type or component.
Although partial monitoring cuts the monitoring overhead to a practical level in most cases, conventional partial monitoring techniques are also likely to introduce inconsistencies in the reported event stream. To understand how such inconsistencies may be introduced, one needs a basic understanding of what we mean by an object-oriented system in the context of programming languages. A class or type is the building block of an object-oriented language and is a template that describes the data and behavior associated with instances of that class. When one instantiates a class, the object that is created looks and feels like other instances of the same class. The data associated with a class or an object is stored in member variables. The behavior associated with a class or object is implemented with methods. For example, a rectangle can be considered a class that has two attributes or member variables, length and breadth, and a method named area. From this rectangle class, one can create or instantiate new instances or new rectangle objects. These new objects share similar behavior with respect to the attributes and methods that can be invoked on them. However, the actual values of the attributes and the values returned by the methods can be different. Method invocations are executed in the context of a thread of execution.
During the execution of an object-oriented program, potentially many objects are created, manipulated, and returned to system storage when they are no longer needed. This return to system storage can be automatic or through an explicit return-to-storage operation by the program. We refer herein to both types of return as reclaiming or reclamation.
An object-oriented program achieves its task by potentially creating many objects of different classes and invoking methods on them.
Consider a visual representation of an object-oriented program execution through time as shown in FIG. 2. For the sake of discussion, let us assume that the identity of an object is nothing but the memory address at which it has been allocated. During its execution, the program creates five objects, OBJ1 through OBJ5, and invokes methods on these objects. However, OBJ1 and OBJ5 have the same identity, since OBJ5 is allocated at the same address as OBJ1 after OBJ1 has been reclaimed. Since reporting was turned off during the time OBJ1 was reclaimed and OBJ5 was created, a program understanding tool reading the monitoring events generated by a conventional monitoring system can mistakenly associate the costs and method invocations with the wrong objects. The consequence is that tools have to shield themselves from these inconsistencies, either by ignoring them and presenting potentially erroneous results or by implementing costly detection and prevention mechanisms.
The present invention provides an automated methodology for avoiding inconsistencies, while not limiting filtering mechanisms of partial monitoring. The invention is applicable to any system implementing some form of monitoring. The invention will be explained in the context of a monitoring API (application programming interface) relying on events to report internal activity to external listener subsystems. However, the invention is not so limited. That is, the invention may also, for example, apply to the internal design of systems that embed one or more kinds of listener subsystems.
The present invention dictates consistency rules regarding the generation of events. It does not necessarily dictate which events should be generated, nor what kind of partial monitoring is permissible. A basic requirement is that any generated event belong to one and only one of the two following categories: (i) entity events; and (ii) activity events.
Entity events are defined as events which report the creation/existence of an entity in the system as well as its death/reclamation. Activity events are defined as events which, on the contrary, only report that something is happening on one or more existing entities, that is, no entity creation or reclamation is involved.
For instance, an event reporting the creation of an object is an entity event. An event reporting the invocation of a method on an object is also an entity event, for a stack frame has been created. However, an event reporting that an object has been moved in memory or that an object reference has been assigned to another object is an activity event.
It is also important to notice that, because we assume a programmatic monitoring API, entity events have to introduce an identity for the created entities, called the monitoring identity. The rationale for the identity is to allow other events to be able to refer to entities. For instance, an event reporting a method invocation needs to refer to the receiver object, the class implementing the method, and possibly even the thread on which the invocation occurs. Without monitoring identity, this would be impossible.
One advantage of this invention is that it does not require a globally unique identity scheme. Such schemes are expensive to implement and most system designers prefer using a "scoped" identity such as the memory address at which the entity has been allocated. By "scoped," we mean that the identity is only valid from the time the entity is created to the time it is reclaimed. The consequence is that the mapping
between an entity and its identity is time-scoped. Our only requirement is that entity events always report changes in the mapping.
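As an illustration of the time-scoped mapping, the following minimal Python sketch shows how a listener might resolve scoped identities, provided that entity events report every change in the mapping. The class name ScopedIdentityMap, its method names, and the use of small integers as listener-local keys are assumptions made for this example only; they are not part of the described monitoring API.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ScopedIdentityMap:
    """Listener-side table from a scoped identity to the entity it denotes now."""

    live: dict = field(default_factory=dict)  # scoped identity -> listener-local key
    _next_key: count = field(default_factory=lambda: count(1))

    def on_create(self, identity):
        # A creation entity event opens a new scope for this identity.
        if identity in self.live:
            raise ValueError(f"identity {identity!r} reused without a reclaim event")
        key = next(self._next_key)
        self.live[identity] = key
        return key

    def on_reclaim(self, identity):
        # The matching reclamation event closes the scope; the identity may be reused.
        return self.live.pop(identity)

    def resolve(self, identity):
        # Activity events look up the entity that owns the identity at this moment.
        return self.live[identity]


# The same address denotes two different entities at two different times.
ids = ScopedIdentityMap()
obj1 = ids.on_create(0x1000)
ids.on_reclaim(0x1000)
obj5 = ids.on_create(0x1000)
```

If the reclaim event were missing from the stream, the second on_create call would raise, which is the kind of confusion between OBJ1 and OBJ5 discussed in connection with FIG. 2.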
To further explain what we mean by object-oriented monitoring through events, let us define in abstract terms the concepts and events of a very typical object-oriented system. This definition applies to substantially any existing object-oriented system. However, it is to be appreciated that the invention is not intended to be limited by this abstract definition. An object-oriented system presents four basic entities: (i) type; (ii) object; (iii) thread; and (iv) invocation. A type describes an object structure and specifies its behavior, i.e., the set of methods one may invoke on an object of that type. An invocation is the execution of a method on its receiver object. Each invocation is carried out on one and only one thread. Invocations nest, forming the thread execution stack.
Typically, such a system would have entity events to report the creation and reclamation of these entities. For instance, there would be events reporting the creation and reclamation of an object or a thread, events reporting the loading and unloading of a type, and/or events reporting the beginning or end of an invocation. These events may represent the core monitoring. That core would then typically be extended with extra activity events, such as an event reporting that a thread is suspended or resumed. Other events may be used to report object management activities such as compaction, probably yielding new monitoring identities for objects. Synchronization events, reporting the "enters" (entries) and "leaves" (departures) from a critical section, as well as potential waits, would also be examples of activity events. A critical section is a set of program instructions that need to be executed as a unit with respect to the data structures they manipulate. In a system having a single "thread" of execution, achieving a critical section is trivial. But in modern systems, many threads of execution are executed concurrently to improve functionality and processing time. It is quite possible that these threads need to access and manipulate shared data structures. To maintain consistency, a lock structure is associated with the shared data, and a thread performs the operations in a critical section only after gaining ownership of the lock. Once the operations are completed, the thread releases ownership of the lock. If another thread has lock ownership, then a thread desiring to acquire the lock waits on the owning thread until the lock is free. A diagram showing the different threads in a system, with arrows drawn from a thread waiting for a resource to the thread owning the resource, is called a wait graph.
Such a wait graph, with accurate information and some additional summary information, is an excellent tool for determining sources of contention and reasons for infinite waiting of threads for resources.
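To make the wait graph concrete, here is a minimal Python sketch that replays a stream of synchronization events and derives the arrows from waiting threads to owning threads. The tuple layout (kind, thread, lock), the event kind strings, and the function names are assumptions of this example. It also assumes that each lock has at most one owner and is not re-entrant.

```python
def build_wait_graph(events):
    """Return {waiting_thread: owning_thread} after replaying the events.

    Each event is a (kind, thread, lock) tuple, where kind is one of
    "acquire", "release", "begin-wait" or "end-wait".
    """
    owner = {}    # lock -> thread that currently owns it
    waiting = {}  # thread -> lock it is currently waiting for
    for kind, thread, lock in events:
        if kind == "acquire":
            owner[lock] = thread
        elif kind == "release":
            owner.pop(lock, None)
        elif kind == "begin-wait":
            waiting[thread] = lock
        elif kind == "end-wait":
            waiting.pop(thread, None)
    return {t: owner[l] for t, l in waiting.items() if l in owner}


def find_cycle(graph):
    """Return the threads forming a cycle in the wait graph, or None."""
    for start in graph:
        path, node = [], start
        while node in graph and node not in path:
            path.append(node)
            node = graph[node]
        if node in path:
            return path[path.index(node):]
    return None


full = [
    ("acquire", "T1", "L1"), ("acquire", "T2", "L2"),
    ("begin-wait", "T1", "L2"),
    ("release", "T2", "L2"), ("end-wait", "T1", "L2"), ("acquire", "T1", "L2"),
    ("begin-wait", "T2", "L1"),
]
# Same execution, but reporting was off for the three events in the middle.
partial = full[:3] + full[6:]

full_cycle = find_cycle(build_wait_graph(full))
partial_cycle = find_cycle(build_wait_graph(partial))
```

With the full stream, T2 simply waits on T1 and no cycle exists. With the three events dropped, the listener sees T1 and T2 waiting on each other, that is, a deadlock that never occurred in the program.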
In the presence of partial monitoring, inconsistencies in the above abstract system may appear under four major forms: (i) type inconsistencies; (ii) object inconsistencies; (iii) execution inconsistencies; and (iv) synchronization inconsistencies.
Type inconsistencies may exist when types are confused because the correspondence between types and their identifiers is not maintained accurately. Object inconsistencies may exist when objects are confused because the correspondence between objects (type instances) and their identifiers is not maintained accurately. Execution inconsistencies may exist when threads or method invocations are confused, because the correspondence between threads and their identifiers is not maintained accurately or because the thread stack nesting of methods is not maintained accurately. Synchronization inconsistencies may exist when the ownership of locks and critical sections is confused, or when the wait graph between threads is confused.
Each of these cases may provoke the erroneous reporting of potentially serious conditions by conventional monitoring tools that support some form of partial monitoring. Such erroneous reporting may include, for example, erroneous profiling information (accounting of times), erroneous object reclamation, erroneous memory leaks, or even false deadlocks. Unfortunately, the prior art offers no comprehensive solution to these and other problems in the area of supporting partial and consistent monitoring for object-oriented systems. In general, conventional program execution environments leave the burden of making sense of the generated information to the information processing tools. In the context of the Java language (see, e.g., "Java Language Specification," J. Gosling, B. Joy and G. Steele, Addison Wesley, ISBN 0201634511 (1996); and "Java 1.1 Developer's Handbook," P. Heller, S. Roberts, with P. Seymour and T. McGinn, Sybex, ISBN 0-7821-1919-0), the Java Virtual Machine (JVM) from Sun Microsystems exports an API called JVMPI (see, e.g., "Comprehensive profiling support in the Java Virtual Machine," Sheng Liang and Deepa Viswanathan, Usenix Conference on Object-Oriented Technologies (COOTS) 1999) that allows for callbacks into the JVM to handle certain inconsistencies related to the type system. However, it is known to be incomplete. In particular, the trace generation can be incorrect in the presence of garbage collection. In the context of the C language, some work has been done on a tool named Parasight (see, e.g., "Non-intrusive and interactive profiling in Parasight," Ziya Aral and Ilya Gertner, Proceedings of the ACM/SIGPLAN PPEALS 1988, Parallel Programming: Experience with Applications, Languages and Systems, pages 21-30, July 1988) that provides some support for consistent tracing. However, that work does not address the issues due to garbage collection, since garbage collection does not exist for normal C programs.
In a significant departure over the prior art, the present invention provides methodologies for supporting partial and consistent monitoring of object-oriented programs. These methods, when incorporated in program execution environments, can be used to produce consistent information which can be used by batch and interactive tools such as, for example, program analyzers, visualizers and debuggers for purposes such as program understanding, visualization and debugging.
For the description of this embodiment of the invention, we will assume without loss of generality that a virtual machine (VM) executes the object-oriented system of interest and events occurring as part of the system execution are delivered to external event listeners which have to register to a single event source, exported by the running VM. Notice that the VM does not take care of maintaining any history of generated events. That would be the responsibility of listeners.
The invention provides a monitoring methodology that is composed of two parts: (i) a categorization of events that fully describe the execution of an object-oriented program; and (ii) a set of sequencing rules in order to ensure the consistency of the event stream. These parts will be explained in detail below. Referring back to FIG. 1, it is to be appreciated that this inventive methodology may be implemented by the monitoring agent 8 (also referred to herein as the monitoring system or subsystem) within the program execution environment 4. The object-oriented program would then be program 2 in FIG. 1. The event stream
is reported by the monitoring agent 8 to, for example, the program event processing tool 10. Reporting may be controlled by the event reporting controller 12.
Thus, the invention applies to any system that needs to support consistent partial monitoring and whose events can be expressed as belonging to one of the specified event categories.
Categories of Monitoring Events
According to one embodiment of the invention, we group monitoring events into four categories: (i) type events; (ii) object events; (iii) execution events; and (iv) synchronization events.
Type events report the definition of types, which in turn allows objects to be described as instances of those types. A type is either a basic type, a class or an interface. Basic types are the classical ones such as integers, floats, etc. A class has a name, a class it is derived from (called the super class), a list of implemented interfaces, a set of methods that it implements, and a set of fields it defines. An interface has a name, a set of interfaces it extends, and a set of methods and constants it declares. A method has a return type, a name, and a list of parameter types (the receiver being implicit and compatible with the class implementing the method). A field has a name and a type.
We define three events on types: create, load, and reclaim. The rationale for three events is further elaborated in the following section (Sequencing Rules). The create event specifies the name of the type and its monitoring identity. The load event defines the type. It refers to the super class and to the implemented interfaces. It also includes the description of the class members, that is, the methods and the fields. Method descriptions refer to the return and parameter types. Field descriptions refer to the field type. The reclaim event indicates that the type is no longer known to the monitored system.
Object events report the life cycle of objects (creation, destruction) as well as the object graph, that is, references between objects. Objects are created upon explicit request and later reclaimed. Whenever an object is allocated, a create event is generated; whenever an object is reclaimed, a reclaim event is generated. The create event includes the monitoring identity of the created object and refers to the class of the object. The reclaim event just refers to the reclaimed object.
A reference identifies an object at runtime and allows an object A to refer to an object B. A reference event reports the existence of a reference from a pointing-to object to a pointed-to object (potentially identical). A reference event refers to the field (and may also refer to its declaring class) containing the reference and to the referred-to object.
Execution events report the procedural aspect of the monitored system, while retaining its object-oriented characteristics. First of all, two events report the creation and destruction of a thread. The create event provides the thread name and its monitoring identity. The destruction event just refers to the reclaimed thread. A renaming event is also provided so that name changes can be tracked.
A thread executes method invocations following a Last-In-First-Out (LIFO) model. In other words, if a thread executes a method A that invokes another method B, then the LIFO model states that the execution of method B completes before the execution of method A completes. It can be said that method A is the parent of method B. A method invocation represents the execution of the method instructions. Two events are used to report method invocation: an enter and a leave event. The enter event refers to the thread, the class implementing the currently executing method, and the receiver object. The leave event only refers to the thread. This is sufficient because of the LIFO model.
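The following minimal Python sketch illustrates why a leave event that names only the thread is sufficient: a listener can keep one stack per thread and pop the innermost open invocation. The class name CallStacks, the method signatures, and the sample identities ("T1", "Rectangle", "r1") are assumptions of this example.

```python
from collections import defaultdict


class CallStacks:
    """Rebuilds per-thread invocation stacks from enter and leave events."""

    def __init__(self):
        self.stacks = defaultdict(list)  # thread identity -> list of open frames

    def enter(self, thread, cls, receiver):
        # The enter event names the thread, the implementing class and the receiver.
        # The parent invocation is whatever frame is innermost on that thread.
        parent = self.stacks[thread][-1] if self.stacks[thread] else None
        self.stacks[thread].append((cls, receiver))
        return parent

    def leave(self, thread):
        # The leave event names only the thread: LIFO nesting implies that the
        # invocation being left is the innermost open one.
        return self.stacks[thread].pop()


stacks = CallStacks()
stacks.enter("T1", "Rectangle", "r1")            # method A
parent_of_b = stacks.enter("T1", "Shape", "r1")  # method B, invoked by A
left_first = stacks.leave("T1")
left_second = stacks.leave("T1")
```

Note that the reconstruction only works if every enter has been reported. A leave event arriving for an invocation whose enter was filtered out would pop the wrong frame.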
Synchronization events report the activity regarding critical sections and locks. Many different synchronization semantics exist and are used by different systems, or even sometimes combined. However, most synchronization mechanisms ultimately rely on locks. A thread may acquire a lock, release it if it owns it, or be suspended waiting for the lock to become available. Often, locks are associated with objects. We define four events: acquire, release, begin-wait and end-wait. All synchronization events refer to the involved thread and the associated object, if applicable.
Sequencing Rules
Events have to be sequenced in a certain way to ensure that the reported event stream is consistent. In this section, we first assume full reporting, that is, each listener is sent all events and in the same order. We then consider partial reporting.
When full reporting is assumed, there are only two fundamental sequencing rules:
1. The entity event reporting the creation of an entity must precede the event reporting its reclamation (death).
2. Any activity events referring to an entity must appear between the entity events reporting the creation and reclamation of that entity.
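The two rules above can be checked mechanically on an event stream. The Python sketch below is one possible checker, written under simplifying assumptions: events are tuples whose first element is "create", "reclaim" or "activity", followed by the scoped identities they refer to, and a create event refers only to the entity it creates. The function name and the violation messages are hypothetical.

```python
def check_stream(events):
    """Return a list of (index, message) violations of sequencing rules 1 and 2."""
    live = set()  # identities currently between their create and reclaim events
    violations = []
    for index, (kind, *idents) in enumerate(events):
        if kind == "create":
            live.add(idents[0])
        elif kind == "reclaim":
            # Rule 1: creation must precede reclamation.
            if idents[0] not in live:
                violations.append((index, f"reclaim of {idents[0]!r} before its create"))
            live.discard(idents[0])
        else:
            # Rule 2: an activity event may only refer to live entities.
            for ident in idents:
                if ident not in live:
                    violations.append((index, f"activity on {ident!r} outside its lifetime"))
    return violations


good = [("create", "o1"), ("activity", "o1"), ("reclaim", "o1")]
bad = [("activity", "o1"), ("create", "o1"), ("reclaim", "o1"), ("reclaim", "o1")]

good_result = check_stream(good)
bad_result = check_stream(bad)
```

Because identities are scoped, the checker tracks the set of live identities rather than a history of all identities ever seen, so an identity can be created again after it has been reclaimed.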
Although these two rules seem very intuitive, their enforcement is tricky in some cases. We explained earlier that we define three events for types: the create, load, and reclaim events. The rationale is a sequencing one. Type definitions are by nature recursive and potentially cyclic. This suggests separating the definition of a type from its mere existence, so as to be able to break cyclic dependencies in type definitions. The create event must precede the load event, which must precede the reclaim event. Any other event referring to a class must appear after the load event for that class. For certain languages, maintaining such a precedence invariant in the VM can be tricky. For instance, consider the static initializer feature in the Java language. Static initializers are snippets of code which initialize class static fields at load time. Care must be taken to issue the class load event before any static initializer is run. Otherwise, the sequencing rules would be violated, with events such as enter and leave events referring to a non-loaded class.
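As a rough illustration of how separate create and load events break cyclic type definitions, the Python sketch below emits a create event for every type a definition refers to before emitting the load event that carries the definition. The function name, the event tuples, and the sample types Node and List are assumptions of this example, and reading load events as permitted to refer to created but not yet loaded types is our interpretation of the text.

```python
def emit_type_events(definitions, load_order):
    """Emit create and load events so that every load refers only to created types.

    definitions maps a type name to the names of the types its definition refers to.
    """
    created, events = set(), []

    def ensure_created(name):
        # A create event only announces existence (name and identity), so it can
        # be issued for a type whose definition has not been reported yet.
        if name not in created:
            created.add(name)
            events.append(("create", name))

    for name in load_order:
        ensure_created(name)
        for ref in definitions[name]:
            ensure_created(ref)
        events.append(("load", name, tuple(definitions[name])))
    return events


# Node has a field of type List, and List has a field of type Node.
cyclic = {"Node": ["List"], "List": ["Node"]}
events = emit_type_events(cyclic, ["Node", "List"])
```

With a single combined event, neither Node nor List could be reported first, since each definition refers to the other.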
Consider the more interesting case of partial reporting. In this context, listeners are allowed to get only a partial view of the running system. Many different ways have been designed to specify which partial view is wanted. Some of the partial reporting options that can be provided include: (i) listening only to one kind of event (type, execution, object, or synchronization); (ii) filtering events on a thread basis; and (iii) suspending and resuming monitoring, globally or selectively per thread.
In the event of partial reporting, the previous sequencing rules are not enough to ensure consistency. The following three rules must be added to the above two rules:
3. No defined entity may be reclaimed without its reclamation being reported.
4. No invocation may be reported without its parent being reported.
5. The four synchronization events (acquire, release, begin-wait, end-wait) must always exhibit a well-formed sequence.
These rules have to be maintained by the running system, as do the first two sequencing rules. However, it is important to notice that maintaining these rules remains a requirement even when reporting has been totally or partially turned off. In other words, monitoring as a subsystem is never turned off; it may merely refrain from reporting.
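The Python sketch below illustrates the idea that monitoring continues while reporting is off, using rule 3 as the example. The class MonitoringAgent, its method names, and the "late create" strategy (reporting the creation of an entity the first time a reported event refers to it) are assumptions made for illustration. They are not taken from the methodology of FIGS. 4 and 5, and rules 4 and 5 are not covered.

```python
class MonitoringAgent:
    """Keeps tracking entities while reporting is off so the stream stays consistent."""

    def __init__(self, listener):
        self.listener = listener  # callable that receives each reported event
        self.reporting = True
        self.defined = set()      # identities whose creation has been reported

    def _define(self, ident):
        if ident not in self.defined:
            self.defined.add(ident)
            self.listener(("create", ident))

    def created(self, ident):
        if self.reporting:
            self._define(ident)

    def activity(self, ident):
        if self.reporting:
            self._define(ident)  # late create (assumption), so rule 2 still holds
            self.listener(("activity", ident))

    def reclaimed(self, ident):
        # Rule 3: reported regardless of the reporting switch, as long as the
        # listener has been told that the entity exists.
        if ident in self.defined:
            self.defined.remove(ident)
            self.listener(("reclaim", ident))


seen = []
agent = MonitoringAgent(seen.append)
agent.created(0x1000)    # OBJ1
agent.reporting = False
agent.reclaimed(0x1000)  # OBJ1 reclaimed while reporting is off
agent.created(0x1000)    # OBJ5 allocated at the same address, not reported
agent.reporting = True
agent.activity(0x1000)   # refers to OBJ5
```

In this run the listener receives a reclaim for OBJ1 and a fresh create for OBJ5 before the activity event, so it cannot attribute the activity to OBJ1.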