This invention relates generally to a system for processing database queries, and more particularly to a method for generating high-level language or machine code to implement query execution plans. The present invention provides a method for generating executable machine code for query execution plans that is adaptive to dynamic runtime conditions, that is compiled just in time for execution and, most importantly, that avoids the bounds checking, pointer indirection, materialization, and other similar kinds of overhead that are typical in interpretive runtime execution engines.
Claims
1. A method for generating machine executable code for implementing a query of a database, the database having tables and records of data, comprising the steps of:
- receiving a subject query;
- forming an execution plan corresponding to the subject query, the execution plan having a sequence of pieces and corresponding processes for implementing the pieces;
- for each piece of the plan, (a) generating source code using different code generation techniques as a function of any combination of data characteristics, current conditions and workload, and (b) compiling the generated source code to form machine executable code for implementing the subject query, said compiling being in a manner that optimizes total query processing time, including compilation time and execution time,
- wherein the subject query includes a join operation; and
- wherein the step of generating source code includes (a) representing the output stream of the join operation as local variables that reference current records in each input stream, and (b) projecting named fields of the records, such that use of intermediate materialization and auxiliary structures is minimized, wherein the step of projecting includes utilizing structure offset expressions in the generated source code.
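As a rough illustration of the join limitations recited in claim 1, the following C++ sketch shows what generated code of this kind might look like. The record layouts (`OrderRec`, `CustomerRec`, `OutRow`), the field names, and the functions `fieldAt` and `joinAndProject` are hypothetical and do not come from the patent. The sketch assumes a simple nested-loop equi-join; it only shows the join output being represented by two local pointers to the current record of each input stream, with named fields projected through `offsetof`-based structure offset expressions instead of being copied into an intermediate joined record first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record layouts for the two input streams (not from the patent).
struct OrderRec {
    std::int32_t order_id;
    std::int32_t cust_id;
    std::int64_t amount_cents;
};

struct CustomerRec {
    std::int32_t cust_id;
    std::int32_t region;
};

// Projected result: only the named fields the query asks for.
struct OutRow {
    std::int32_t order_id;
    std::int32_t region;
    std::int64_t amount_cents;
};

// Structure offset expression: read a T located `offset` bytes into a record.
template <typename T>
T fieldAt(const void* rec, std::size_t offset) {
    assert(rec != nullptr);
    T value;
    std::memcpy(&value, static_cast<const char*>(rec) + offset, sizeof(T));
    return value;
}

// Nested-loop equi-join on cust_id (an assumed join strategy for illustration).
std::vector<OutRow> joinAndProject(const std::vector<OrderRec>& orders,
                                   const std::vector<CustomerRec>& customers) {
    std::vector<OutRow> out;
    for (std::size_t i = 0; i < orders.size(); ++i) {
        // Local variable referencing the current record of the first stream.
        const OrderRec* curOrder = &orders[i];
        for (std::size_t j = 0; j < customers.size(); ++j) {
            // Local variable referencing the current record of the second stream.
            const CustomerRec* curCust = &customers[j];
            if (fieldAt<std::int32_t>(curOrder, offsetof(OrderRec, cust_id)) !=
                fieldAt<std::int32_t>(curCust, offsetof(CustomerRec, cust_id))) {
                continue;  // join predicate failed
            }
            // Project named fields straight from the two current records;
            // no joined intermediate record is materialized.
            OutRow row;
            row.order_id = fieldAt<std::int32_t>(curOrder, offsetof(OrderRec, order_id));
            row.region = fieldAt<std::int32_t>(curCust, offsetof(CustomerRec, region));
            row.amount_cents = fieldAt<std::int64_t>(curOrder, offsetof(OrderRec, amount_cents));
            out.push_back(row);
        }
    }
    return out;
}
```

Calling `joinAndProject` with a few orders and customers returns one `OutRow` per matching pair; orders without a matching customer produce no output in this inner-join sketch.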
2. A method as claimed in claim 1 wherein the data characteristics include any combination of precision of a value existing in the database, scale of a value existing in the database, size of affected data existing in the database and data type.
3. A method as claimed in claim 1 wherein the step of generating further includes selecting code generation techniques based on intermediate results from earlier pieces, such that source code generation is effectively adapted to dynamic conditions.
4. A method as claimed in claim 1 wherein:
- the subject query is in a language native to the database; and
- the generated source code is in a high level language.
5. A method as claimed in claim 1 wherein the step of generating source code includes generating source code for each process of each piece of the plan on a need-only basis, such that a savings on compilation time is achieved.
6. A method as claimed in claim 5 wherein the need-only basis provides generating source code for only those relevant members of a declared structure.
7. A method as claimed in claim 1 wherein the step of generating source code includes: (a) minimally defining relevant structures and classes, and (b) forming therefrom optimized include statements in the source code, the optimized include statements enabling reduced compilation time.
8. A method as claimed in claim 1 wherein the step of generating source code includes adjusting variables in the source code to have widths no larger than widths of actual data values of respective data in the database.
9. A method as claimed in claim 1 wherein the step of generating source code uses different combinations of the code generation techniques to provide different degrees of optimization in compiling.
10. A method as claimed in claim 1 wherein the step of generating source code further includes for each instance of a function call to a respective function recited in the source code, (i) determining size of data affected by the function call, and (ii) based on determined data size, replacing the instance of the function call in the source code with source code for implementing the respective function, such that the respective function is coded in-line.
11. A method as claimed in claim 10 wherein the step of replacing is performed as a function of determined data size relative to a threshold.
12. A method as claimed in claim 1 wherein the subject query includes an outer join operation; and
- the step of generating source code includes effectively overwriting field references in the outer join operation with null values.
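One possible reading of the outer join handling in claim 12 can be sketched in C++ as follows. This is an interpretation under stated assumptions, not the patent's implementation: the record types `EmpRec`, `DeptRec`, and `OutRow`, the `is_null` flag, the shared `kNullDept` record, and the functions `emitRow` and `leftOuterJoin` are all hypothetical. In this sketch, when an outer row has no match, the local variable for the inner stream is pointed at a single record whose null value indicator is set, so every field reference through it reads as NULL without building a padded copy of the row.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record layouts (not from the patent).
struct EmpRec {
    std::int32_t emp_id;
    std::int32_t dept_id;
};

struct DeptRec {
    std::int32_t dept_id;
    std::int32_t budget;
    bool is_null;  // assumed null value indicator covering the record's fields
};

struct OutRow {
    std::int32_t emp_id;
    std::int32_t budget;
    bool budget_is_null;
};

// Project fields from the current records of both streams.
void emitRow(const EmpRec* curEmp, const DeptRec* curDept, std::vector<OutRow>& out) {
    assert(curEmp != nullptr && curDept != nullptr);
    OutRow row;
    row.emp_id = curEmp->emp_id;
    row.budget_is_null = curDept->is_null;
    row.budget = curDept->is_null ? 0 : curDept->budget;
    out.push_back(row);
}

// Left outer join of employees to departments on dept_id.
std::vector<OutRow> leftOuterJoin(const std::vector<EmpRec>& emps,
                                  const std::vector<DeptRec>& depts) {
    // Shared all-NULL record; unmatched rows reference it instead of a real one.
    static const DeptRec kNullDept = {0, 0, true};
    std::vector<OutRow> out;
    for (const EmpRec& e : emps) {
        const EmpRec* curEmp = &e;
        bool matched = false;
        for (const DeptRec& d : depts) {
            if (d.dept_id != curEmp->dept_id) continue;
            matched = true;
            emitRow(curEmp, &d, out);
        }
        if (!matched) {
            // Overwrite the inner-side field references with NULL by
            // redirecting the stream variable to the null record.
            emitRow(curEmp, &kNullDept, out);
        }
    }
    return out;
}
```

An employee whose `dept_id` has no matching department still yields one output row, with `budget_is_null` set to true.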
13. A method as claimed in claim 12 wherein the step of generating source code includes using a null value indicator for overwriting field references in the outer join operation.
14. A method for generating machine executable code for implementing a query of a database, the database having tables and records of data, comprising the steps of:
- receiving a subject query having a join operation;
- forming an execution plan corresponding to the subject query, the execution plan having a sequence of pieces and corresponding processing nodes for implementing the pieces;
- generating high level language source code for each piece of the plan including (a) minimally defining relevant structures and classes, and (b) forming therefrom optimized Include statements in the source code; and
- compiling the generated source code to form machine executable code for implementing the subject query, the formed optimized Include statements in the source code enabling relatively reduced compilation time,
- the step of generating high level language source code includes (a) representing the output stream of the join operation as local variables that reference current records in each input stream, and (b) projecting named fields of the records, such that use of intermediate materialization and auxiliary structures is minimized, wherein the step of projecting includes utilizing structure offset expressions in the generated source code.
15. A method as claimed in claim 14 wherein the step of forming optimized Include statements in the source code includes generating source code for only those relevant members of a declared structure.
16. A method as claimed in claim 14 wherein the step of minimally defining includes inserting in-line into the generated source code respective local declarations of relevant structures and classes defining only those class members and structure members that will be used by the generated source code.
17. A method as claimed in claim 14 wherein:
- the high level language is C++; and
- the step of generating source code includes omitting a traditional #include statement of a header of a structure declaration in the generated source code in order to minimize compile time for the generated source code.
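The minimal-declaration idea in claims 14 to 17 can be sketched from the code generator's side. The C++ below is an assumed, simplified generator, not the patent's: `FieldDef` and `emitMinimalStruct` are hypothetical names. It emits a local declaration containing only the structure members that the generated code will use, so the generated source needs no `#include` of the full header. The sketch ignores member layout; a real generator that reads existing records would presumably also have to keep each retained member at its original offset, for example by emitting padding.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical description of one member of a declared structure.
struct FieldDef {
    std::string type;
    std::string name;
};

// Emit a local declaration of `structName` that contains only the members
// named in `used`, kept in their original declaration order.
std::string emitMinimalStruct(const std::string& structName,
                              const std::vector<FieldDef>& allFields,
                              const std::set<std::string>& used) {
    assert(!structName.empty());
    std::string src = "struct " + structName + " {\n";
    for (const FieldDef& f : allFields) {
        if (used.count(f.name) == 0) continue;  // member never referenced by the plan
        src += "    " + f.type + " " + f.name + ";\n";
    }
    src += "};\n";
    return src;
}
```

For a structure with members `cust_id`, `created_at`, and `region`, of which the plan uses only `cust_id` and `region`, the emitted text declares just those two members, which gives the compiler less to parse than the full header would.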