This invention relates generally to a system for processing database queries, and more particularly to a method for generating high level language or machine code to implement query execution plans. The present invention provides a method for generating executable machine code for query execution plans that is adaptive to dynamic runtime conditions, that is compiled just in time for execution and, most importantly, that avoids the bounds checking, pointer indirection, materialization and other similar kinds of overhead that are typical in interpretive runtime execution engines.

Inventors: Barry M. Zane, James P. Ballard, Foster D. Hinshaw, Dana A. Kirkpatrick, Premanand Yerabothu
Original Assignee: Netezza Corporation
Primary Examiner: Diane Mizrahi
Attorney: Hamilton, Brook, Smith & Reynolds, P.C.
Current U.S. Classification: 1/1; 707/999.003; 707/999.004; 707/999.005

Citations

Cited Patent | Filing date | Issue date | Original Assignee | Title
US4829427 | May 25, 1984 | May 9, 1989 | Data General Corporation | Database query code generation and optimization based on the cost of alternate access methods
US5091852 | Jan 25, 1989 | Feb 25, 1992 | Hitachi, Ltd. | System for optimizing query processing in a relational database
US5301317 | Apr 27, 1992 | Apr 5, 1994 | International Business Machines Corporation | System for adapting query optimization effort to expected execution time
US5806059 | Mar 4, 1997 | Sep 8, 1998 | Hitachi, Ltd. | Database management system and method for query process for the same
US5875334 | Oct 27, 1995 | Feb 23, 1999 | International Business Machines Corporation | System, method, and program for extending a SQL compiler for handling control statements packaged with SQL query statements
US5930795 | Jan 21, 1997 | Jul 27, 1999 | International Business Machines Corporation | Supporting dynamic tables in SQL query compilers
US6219660 | Sep 8, 1999 | Apr 17, 2001 | International Business Machines Corporation | Access path selection for SQL with variables in a RDBMS
US6353819 | Sep 29, 1999 | Mar 5, 2002 | Bull HN Information Systems Inc. | Method and system for using dynamically generated code to perform record management layer functions in a relational database manager
US6907546 | Mar 27, 2000 | Jun 14, 2005 | Accenture LLP | Language-driven interface for an automated testing framework
US7089448 | Sep 18, 2003 | Aug 8, 2006 | Netezza Corporation | Disk mirror architecture for database appliance
US7203678 | Mar 27, 2001 | Apr 10, 2007 | BEA Systems, Inc. | Reconfigurable query generation system for web browsers
US20010052108 | Aug 31, 1999 | | | SYSTEM, METHOD AND ARTICLE OF MANUFACTURING FOR A DEVELOPMENT ARCHITECTURE FRAMEWORK
US20030233632 | Jun 12, 2002 | | Lockheed Martin Corporation | Automatically generated client application source code using database table definitions
US20040148420 | Sep 18, 2003 | | Netezza Corporation | Programmable streaming data processor for database appliance having multiple processing unit groups
US20040181537 | Dec 16, 2003 | | SYBASE, INC. | System with Methodology for Executing Relational Operations Over Relational Data and Data Retrieved from SOAP Operations
US20060129605 | Feb 2, 2006 | | | System and method for automating the development of web services that incorporate business rules

Referenced by

Citing Patent | Filing date | Issue date | Original Assignee | Title
US7590620 | Sep 29, 2004 | Sep 15, 2009 | Google Inc. | System and method for analyzing data records
US7599949 | Apr 30, 2004 | Oct 6, 2009 | Unisys Corporation | Database management system and method for maintaining a database in a range sensitive manner
US7966340 | Sep 8, 2010 | Jun 21, 2011 | Aster Data Systems, Inc. | System and method of massively parallel data processing
US8073826 | Oct 18, 2007 | Dec 6, 2011 | Oracle International Corporation | Support for user defined functions in a data stream management system
US8126909 | Jul 31, 2009 | Feb 28, 2012 | Google Inc. | System and method for analyzing data records
US8145655 | Jun 22, 2007 | Mar 27, 2012 | International Business Machines Corporation | Generating information on database queries in source code into object code compiled from the source code
US8145859 | Mar 2, 2009 | Mar 27, 2012 | Oracle International Corporation | Method and system for spilling from a queue to a persistent store

Claims

1. A method for generating machine executable code for implementing a query of a database, the database having tables and records of data, comprising the steps of:

receiving a subject query;

forming an execution plan corresponding to the subject query, the execution plan having a sequence of pieces and corresponding processes for implementing the pieces;

for each piece of the plan, (a) generating source code using different code generation techniques as a function of any combination of data characteristics, current conditions and workload, and (b) compiling the generated source code to form machine executable code for implementing the subject query, said compiling being in a manner that optimizes total query processing time, including, compilation time and execution time,

wherein the subject query includes a join operation; and
wherein the step of generating source code includes (a) representing output stream of the join operation as local variables that reference current records in each input stream, and (b) projecting named fields of the records, such that use of intermediate materialization and auxiliary structures are minimized, the step of projecting includes utilizing structure offset expressions in the generated source code.
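
For illustration only, the following C++ sketch shows roughly what generated join code of the kind recited in claim 1 might look like: the joined output is never materialized as a record, only local references to the current record of each input stream exist, and named fields are projected through structure offset expressions. The record layouts (OrderRec, CustomerRec), the helper field32 and the function sumAmountForRegion are hypothetical names chosen for this example and do not come from the patent.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdint>
#include <cstring>   // std::memcpy
#include <vector>

// Hypothetical record layouts for the two input streams of the join.
struct OrderRec    { std::int32_t cust_id; std::int32_t amount; };
struct CustomerRec { std::int32_t cust_id; std::int32_t region; };

// Read a 32-bit field located at a fixed byte offset inside a record.
inline std::int32_t field32(const void* rec, std::size_t off) {
    assert(rec != nullptr);
    std::int32_t v;
    std::memcpy(&v, static_cast<const char*>(rec) + off, sizeof v);
    return v;
}

// Nested-loop equi-join on cust_id, restricted to one region, summing amount.
// No joined output record is built: 'o' and 'c' are local references to the
// current record of each input stream, and fields are projected by offset.
std::int64_t sumAmountForRegion(const std::vector<OrderRec>& orders,
                                const std::vector<CustomerRec>& customers,
                                std::int32_t region) {
    std::int64_t total = 0;
    for (const OrderRec& o : orders) {            // current record, stream 1
        for (const CustomerRec& c : customers) {  // current record, stream 2
            if (field32(&o, offsetof(OrderRec, cust_id)) !=
                field32(&c, offsetof(CustomerRec, cust_id))) continue;
            if (field32(&c, offsetof(CustomerRec, region)) != region) continue;
            total += field32(&o, offsetof(OrderRec, amount));
        }
    }
    return total;
}
```

A nested-loop join is used here only to keep the sketch short; the claim does not prescribe a particular join algorithm.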

2. A method as claimed in claim 1 wherein the data characteristics include any combination of precision of a value existing in the database, scale of a value existing in the database, size of affected data existing in the database and data type.

3. A method as claimed in claim 1 wherein the step of generating further includes selecting code generation techniques based on intermediate results from earlier pieces, such that source code generation is effectively adapted to dynamic conditions.

4. A method as claimed in claim 1 wherein: the subject query is in a language native to the database; and the generated source code is in a high level language.

5. A method as claimed in claim 1 wherein the step of generating source code includes generating source code for each process of each piece of the plan on a need-only basis, such that a savings on compilation time is achieved.

6. A method as claimed in claim 5 wherein the need only basis provides generating source code for only those relevant members of a declared structure.

7. A method as claimed in claim 1 wherein the step of generating source code includes: (a) minimally defining relevant structures and classes, and (b) forming therefrom optimized include statements in the source code, the optimized include statements enabling reduced compilation time.

8. A method as claimed in claim 1 wherein the step of generating source code includes, adjusting variables in the source code to have widths no larger than widths of actual data values of respective data in the database.
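
As a hedged illustration of the width adjustment in claim 8, a code generator could pick the narrowest integer type that still holds the value range actually present in a column. The functions narrowestIntType and emitColumnDecl are hypothetical, and the assumption that the generator knows a column's minimum and maximum values is made for this sketch only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Pick the narrowest signed C++ integer type that can hold [minVal, maxVal].
// The range is assumed to come from statistics about the actual data.
std::string narrowestIntType(std::int64_t minVal, std::int64_t maxVal) {
    assert(minVal <= maxVal);
    if (minVal >= INT8_MIN  && maxVal <= INT8_MAX)  return "std::int8_t";
    if (minVal >= INT16_MIN && maxVal <= INT16_MAX) return "std::int16_t";
    if (minVal >= INT32_MIN && maxVal <= INT32_MAX) return "std::int32_t";
    return "std::int64_t";
}

// Emit the declaration of a column variable for the generated source code.
std::string emitColumnDecl(const std::string& name,
                           std::int64_t minVal, std::int64_t maxVal) {
    return narrowestIntType(minVal, maxVal) + " " + name + ";";
}
```

For example, a column whose values lie between 0 and 100 would be declared as an 8-bit variable rather than with the full width of its declared database type.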

9. A method as claimed in claim 1 wherein the step of generating source code uses different combinations of the code generation techniques to provide different degrees of optimization in compiling.

10. A method as claimed in claim 1 wherein the step of generating source code further includes for each instance of a function call to a respective function recited in the source code, (i) determining size of data affected by the function call, and (ii) based on determined data size, replacing the instance of the function call in the source code with source code for implementing the respective function, such that the respective function is coded in-line.

11. A method as claimed in claim 10 wherein the step of replacing is performed as a function of determined data size relative to a threshold.
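
The size-dependent in-lining of claims 10 and 11 can be sketched as a generator step that emits either a function call or an expanded body, depending on how the affected data size compares with a threshold. The claims do not say in which direction the threshold works or what its value is; the sketch assumes that small copies are expanded in-line and larger ones keep the call, and the names emitCopy and kInlineThresholdBytes are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical cutoff in bytes; the patent does not state a value.
constexpr std::size_t kInlineThresholdBytes = 16;

// Emit source code that copies 'nbytes' bytes from 'src' to 'dst'.
// At or below the threshold the copy is written out in-line as individual
// assignments; above it, the ordinary library call is kept.
std::string emitCopy(const std::string& dst, const std::string& src,
                     std::size_t nbytes) {
    assert(!dst.empty() && !src.empty());
    if (nbytes > kInlineThresholdBytes) {
        return "std::memcpy(" + dst + ", " + src + ", " +
               std::to_string(nbytes) + ");";
    }
    std::string out;
    for (std::size_t i = 0; i < nbytes; ++i) {
        const std::string idx = "[" + std::to_string(i) + "]";
        out += dst + idx + " = " + src + idx + ";";
        if (i + 1 < nbytes) out += " ";  // separate statements with a space
    }
    return out;
}
```

Under these assumptions, emitCopy("d", "s", 2) yields two assignment statements, while emitCopy("d", "s", 64) yields a single std::memcpy call.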

12. A method as claimed in claim 1 wherein the subject query includes an outer join operation; and

the step of generating source code includes effectively overwriting field references in the outer join operation with null values.

13. A method as claimed in claim 12 wherein the step of generating source code includes using a null value indicator for overwriting field references in the outer join operation.
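
One possible reading of claims 12 and 13, sketched below in C++, is that generated code for an outer join does not build an all-null record for an unmatched row; instead a null value indicator overrides the field references to the missing side. The record types, the valueIsNull flag and the functions projectValue and leftOuterJoin are assumptions made for this example, not structures taken from the patent.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical input and output layouts for a left outer join.
struct LeftRec  { std::int32_t key; };
struct RightRec { std::int32_t key; std::int32_t value; };
struct OutRow   { std::int32_t key; std::int32_t value; bool valueIsNull; };

// Project the right-side field. When the null indicator is set, the field
// reference is not followed at all and the output is marked null.
inline OutRow projectRow(const LeftRec& l, const RightRec* r, bool rightIsNull) {
    if (rightIsNull) return OutRow{l.key, 0, true};
    assert(r != nullptr);
    return OutRow{l.key, r->value, false};
}

// Left outer join on key: unmatched left records still produce one row.
std::vector<OutRow> leftOuterJoin(const std::vector<LeftRec>& left,
                                  const std::vector<RightRec>& right) {
    std::vector<OutRow> out;
    for (const LeftRec& l : left) {
        bool matched = false;
        for (const RightRec& r : right) {
            if (r.key != l.key) continue;
            matched = true;
            out.push_back(projectRow(l, &r, false));
        }
        if (!matched) out.push_back(projectRow(l, nullptr, true));
    }
    return out;
}
```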

14. A method for generating machine executable code for implementing a query of a database, the database having tables and records of data, comprising the steps of:

receiving a subject query having a join operation;

forming an execution plan corresponding to the subject query, the execution plan having a sequence of pieces and corresponding processing nodes for implementing the pieces;

generating high level language source code for each piece of the plan including (a) minimally defining relevant structures and classes, and (b) forming therefrom optimized Include statements in the source code; and

compiling the generated source code to form machine executable code for implementing the subject query, the formed optimized Include statements in the source code enabling relatively reduced compilation time,
the step of generating high level language source code includes (a) representing output stream of the join operation as local variables that reference current records in each input stream, and (b) projecting named fields of the records, such that use of intermediate materialization and auxiliary structures are minimized, the step of projecting includes utilizing structure offset expressions in the generated source code.

15. A method as claimed in claim 14 wherein the step of forming optimized Include statements in the source code includes generating source code for only these relevant members of a declared structure.

16. A method as claimed in claim 14 wherein the step of minimally defining includes inserting in-line into the generated source code respective local declarations of a relevant structures and classes defining only those class members and structure members that will be used by the generated source code.

17. A method as claimed in claim 14 wherein:

the high level language is C++; and

the step of generating source code includes omitting a traditional #Include statement of a header of a structure declaration in the generated source code in order to minimize compile time for the generated source code.
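
The minimal-declaration idea in claims 14 to 17 might be sketched as a generator step that writes a local structure declaration containing only the members the generated code uses, in place of a #include of the full header. FieldDef and emitMinimalStruct are hypothetical names, and this simplified sketch does not address how the trimmed declaration stays layout-compatible with the full structure.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical description of one member of a full structure declaration.
struct FieldDef { std::string type; std::string name; };

// Emit a local struct declaration that keeps only the members named in
// 'usedFields', preserving their original order. The result is meant to be
// pasted into the generated source instead of including the full header.
std::string emitMinimalStruct(const std::string& structName,
                              const std::vector<FieldDef>& allFields,
                              const std::set<std::string>& usedFields) {
    assert(!structName.empty());
    std::string src = "struct " + structName + " {";
    for (const FieldDef& f : allFields) {
        if (usedFields.count(f.name) == 0) continue;  // member never referenced
        src += " " + f.type + " " + f.name + ";";
    }
    src += " };";
    return src;
}
```

With fields id, price and flag declared and only id and flag used, the emitted text would be "struct Rec { int id; char flag; };", which gives the compiler far less to parse than a full header.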