|Publication number||US7620987 B2|
|Application number||US 11/203,676|
|Publication date||Nov 17, 2009|
|Priority date||Aug 12, 2005|
|Also published as||US20070039048|
|Inventors||Art Shelest, Gregory D. Hartrell|
|Original Assignee||Microsoft Corporation|
Although the Internet has had great successes in facilitating communications between computer systems and enabling electronic commerce, the computer systems connected to the Internet have been under almost constant attack by hackers seeking to disrupt their operation. Many of the attacks seek to exploit vulnerabilities of the application programs, operating systems, and other computer programs executing on those computer systems. One of the most destructive methods of attacking a computer system has been to modify portions of the operating system with software that may perform many of the same functions of the operating system, but also includes malicious functions. These modifications can either replace portions of the operating system or add new programs that are automatically started by the operating system. Such software is referred to as “malware” because of its malicious nature. Once malware is installed, the operating system is “infected” and the malware can control all aspects of the computer system. Such malware includes rootkits, Trojan horses, keystroke loggers, and so on. For example, the malware could intercept keystrokes that a user enters (e.g., a password) and report them to another computer system. As another example, the malware could be a worm that launches a self-propagating attack that exploits a vulnerability of a computer system by taking control and using that computer system to find other computer systems with the same vulnerability and launch attacks (i.e., sending the same worm) against them.
To launch an attack that exploits the same vulnerability, the malware assumes that all to-be-attacked computer systems locate their resources in the same way. For example, malware may operate by overwriting an entry in a system call table so that system calls through that entry are routed to the malware. The malware may assume that the system call table is stored at the same location on each computer system or that its location can be found in the same way (i.e., indirectly through a memory location that contains a pointer to the table).
In addition to infecting an operating system, malware can also infect various applications. One virus, known as the “Slammer” virus, infects SQL server software. This virus takes control of the server by sending a SQL message that causes a buffer to overflow. The overflowing data of the message overwrites part of the server's stack with instructions and with the address of a memory location (e.g., replacing a return address stored in the stack) to which the SQL server then jumps. When the SQL server jumps to the memory location specified by the overwritten address, the malware starts to execute. Because the SQL server executes with a high privilege level, the malware can effectively take control of the server. Once the malware takes control, it can access application and system resources to perform its malicious behavior. The Slammer virus relies on each server's stack being laid out in the same way so that it knows where to store its address in order to take over the server.
Since all installations of a certain version of a program (e.g., application or an operating system) are typically identical, once a hacker develops malware to infect a program, that malware can be used to infect all installations of the program in the same way.
A method and system for obfuscating computer code of a program to protect it from the adverse effects of malware is provided. The obfuscation system retrieves an executable form of the computer code. The obfuscation system then selects various obfuscation techniques to use in obfuscating the computer code. The obfuscation system applies the selected obfuscation techniques to the computer code. When the obfuscation techniques are applied to the computer code, the obfuscation system may need to fix up the computer code to ensure that the obfuscated computer code has functionally the same behavior. The obfuscation system then causes the obfuscated computer code to execute. Malware may find it difficult to find resources of the obfuscated code that are needed to infect the code.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
A method and system for obfuscating computer code of a program to protect it from the adverse effects of malware is provided. In one embodiment, the obfuscation system retrieves an executable form of the computer code. For example, if the computer code is part of an application program, then when the application program is to be executed, the obfuscation system may load the executable file for that application program into memory for execution. The obfuscation system then selects various obfuscation techniques to use in obfuscating the computer code. For example, one obfuscation technique may be to rearrange portions of the executable code so that its behavior is still functionally the same, but malware may find it difficult to locate certain resources of the code because of the rearranging. The obfuscation system then applies the selected obfuscation techniques to the computer code. When the obfuscation techniques are applied to the computer code, the obfuscation system may need to fix up the computer code (e.g., change references to moved functions or moved tables) to ensure that the obfuscated computer code has functionally the same behavior. The obfuscation system then causes the obfuscated computer code to execute. In this way, it may be difficult for malware to find resources of the obfuscated code that are needed to infect the code. For example, if SQL server code had been obfuscated by adding random padding onto the stack, then the Slammer virus would not have been able to rely on the stack of each SQL server having its data stored in exactly the same position relative to the top of the stack. As another example, if certain system calls or pointers to the system calls of the operating system were moved to different locations on each SQL server, then the Slammer virus would not have been able to reliably invoke a desired system call. 
Although the Slammer virus might still have caused an obfuscated SQL server whose buffer was overwritten to fail, the server would likely have failed without spreading the virus, which would have limited the collective adverse effects of the virus on SQL servers.
The obfuscation system may obfuscate computer code at different times depending on computational expense versus security trade-offs. The obfuscation system may obfuscate the operating system or an application program at the time of installation on a computer system. For example, the obfuscation system may be implemented as a component of an installer program. Before the program to be installed is stored on a storage device of the computer system, the obfuscation system is invoked by the installer to obfuscate the computer code of the program. The obfuscation system may randomly select the obfuscation techniques to apply to a program so that each installation of the program will be obfuscated in a different way. The installer then stores that obfuscated computer code on the storage device of the computer system. Thus, each installation of a program will be different, but each time the program is executed, its computer code has the same obfuscation as the last time it was executed. Alternatively or in addition, the obfuscation system may obfuscate the computer code of a program each time that it is loaded for execution. For example, the obfuscation system may be implemented as a component of the loader program. When a program is to be loaded for execution, the loader loads the executable code of the program into memory, randomly selects obfuscation techniques, applies those selected obfuscation techniques to the executable code, and then starts the execution of the program. Since the obfuscation system randomly selects the obfuscation techniques to apply, each execution instance of the program will be different. Although the obfuscation of computer code at the time of execution may provide a higher degree of security against malware than obfuscation only at installation, the overhead of performing the obfuscation at execution time may outweigh the additional security benefits.
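The difference between the two timings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `obfuscate` stand-in (prepending a random number of no-ops), the class names, and the tuple-based instruction format are all hypothetical and are not taken from the patent.

```python
NOP = ("nop",)

def obfuscate(code, rng):
    """Stand-in obfuscation (an assumption): prepend 1 to 16 no-ops."""
    return [NOP] * rng.randint(1, 16) + list(code)

class InstalledProgram:
    """Install-time strategy: obfuscate once, then reuse the stored image."""

    def __init__(self, code, rng):
        # Done a single time by the (hypothetical) installer.
        self.stored = obfuscate(code, rng)

    def load(self):
        # Every execution sees the same layout chosen at installation.
        return list(self.stored)

class LoadTimeProgram:
    """Load-time strategy: keep the original and obfuscate on every load."""

    def __init__(self, code, rng):
        self.original = list(code)
        self.rng = rng

    def load(self):
        # A fresh random choice is made each time the program is loaded,
        # which costs extra work on every execution.
        return obfuscate(self.original, self.rng)
```

Either class can be given `random.Random()` or `random.SystemRandom()` as `rng`. The install-time variant pays the obfuscation cost once, while the load-time variant pays it on each load in exchange for a different layout per execution.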
In one embodiment, the obfuscation system may randomly select the obfuscation techniques that are to be applied to a program. This random selection helps ensure that each installation or executable instance of the program will be obfuscated in a different way, making it difficult for malware to rely on the program storing and accessing resources in a uniform manner. The obfuscation techniques may include equivalent code sequence substitution, reordering code blocks, reordering import tables, varying stack frames, inserting inert instructions, reordering static data, renaming binaries, encrypting computer code, and so on as described below in more detail.
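As a rough sketch of the random selection step, the following Python picks a random subset of the techniques listed above. The technique names come from the text; the selection policy (how many techniques, in what order) and the function name `select_techniques` are assumptions made for illustration.

```python
import random

# Technique names follow the list in the text; everything else is hypothetical.
TECHNIQUES = (
    "equivalent_code_substitution",
    "reorder_code_blocks",
    "reorder_import_table",
    "vary_stack_frames",
    "insert_inert_instructions",
    "reorder_static_data",
    "rename_binaries",
    "encrypt_code",
)

def select_techniques(rng, minimum=2):
    """Pick a random subset of techniques, returned in a random order."""
    count = rng.randint(minimum, len(TECHNIQUES))
    return rng.sample(TECHNIQUES, count)
```

Calling `select_techniques(random.Random())` yields a list of distinct technique names. A deployment concerned about predictability would probably prefer `random.SystemRandom()`, since an attacker who can guess the seed can reproduce the selection.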
The obfuscation technique of equivalent code sequence substitution seeks to change the location of code blocks within a program so that malware cannot rely on fixed locations of code blocks. If malware cannot rely on a function being at a certain location, then it may have no effective way of invoking that function. The equivalent code sequence substitution may apply many different substitution techniques to alter the size of a code block and thus change the location of subsequent code blocks. For example, one substitution technique may replace an addition operation with a more complex operation that generates the same result: an instruction that increments the value of a register may be replaced by a sequence of instructions that adds 2 to the register value and then subtracts 1 from the register value. Because the substitution results in code blocks being relocated, the obfuscation system needs to track transfer instructions (e.g., jumps and calls) and then fix up those instructions to reflect the relocation of their targets.
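The increment example and the jump fix-up it requires can be illustrated on a toy instruction set. The tuple-based instructions, the tiny interpreter `run`, and the sample `PROGRAM` are assumptions made for this sketch; a real implementation would operate on machine code and would also have to preserve flags and overflow behavior.

```python
def substitute_inc(program):
    """Replace each ("inc", r) with add 2 / sub 1 and fix up jump targets."""
    new_index = {}  # old instruction index -> index after substitution
    out = []
    for old_i, ins in enumerate(program):
        new_index[old_i] = len(out)
        if ins[0] == "inc":
            out.append(("add", ins[1], 2))
            out.append(("sub", ins[1], 1))
        else:
            out.append(ins)
    # Fix-up pass: transfer instructions must point at the relocated targets.
    fixed = []
    for ins in out:
        if ins[0] == "jmp":
            fixed.append(("jmp", new_index[ins[1]]))
        elif ins[0] == "jnz":
            fixed.append(("jnz", ins[1], new_index[ins[2]]))
        else:
            fixed.append(ins)
    return fixed

def run(program, regs):
    """Interpret the toy program and return the final register values."""
    regs, pc = dict(regs), 0
    while program[pc][0] != "halt":
        op, *args = program[pc]
        if op == "jmp":
            pc = args[0]
            continue
        if op == "jnz":
            if regs[args[0]] != 0:
                pc = args[1]
                continue
        elif op == "inc":
            regs[args[0]] += 1
        elif op == "add":
            regs[args[0]] += args[1]
        elif op == "sub":
            regs[args[0]] -= args[1]
        pc += 1
    return regs

# Hypothetical program: increment a once, then once more per loop iteration.
PROGRAM = [
    ("inc", "a"),
    ("inc", "a"),       # loop entry, index 1
    ("sub", "n", 1),
    ("jnz", "n", 1),    # jump back to index 1 while n != 0
    ("halt",),
]
```

After substitution the loop entry moves from index 1 to index 2, so the `jnz` target is rewritten accordingly, and `run(substitute_inc(PROGRAM), {"a": 0, "n": 3})` produces the same registers as running `PROGRAM` directly.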
The obfuscation technique of reordering code blocks seeks to hide the location of code blocks that may be needed by the malware. A code block may be a basic block in the sense that it is a sequence of instructions that has only one entry point from outside the code block. Although basic blocks are generally considered to be the shortest sequences of such instructions, several basic blocks can be combined into a larger code block that has only one entry point from outside the code block. The ordering of code blocks is important to correct operation of a program because each code block that does not end in a jump instruction may rely on the first instruction of the following code block being executed after its last instruction. Thus, when code blocks are reordered, the obfuscation system may need to insert jump instructions to ensure that the execution order of the code blocks (i.e., code path) is preserved, although the in-memory order of the code blocks is not preserved. In addition, the obfuscation system needs to fix up transfer instructions to reflect the relocations of their target code blocks. By randomly reordering code blocks, the obfuscation system can help ensure that each installation or instance of a program has a unique arrangement of its code blocks. Alternatively or in addition, the obfuscation system can insert inert code blocks or instructions into the computer code to change the offset of a code block without affecting the behavior of the program. For example, the obfuscation system can insert loops that swap the location of data and then re-swap the data back to its original location, insert no-operation instructions, insert instructions that increment useless variables, and so on.
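A minimal sketch of block reordering on a toy instruction set follows. It makes each fall-through explicit with an inserted jump, lays the blocks out in a new order, and fixes up symbolic targets to the new addresses. The instruction format, the labels, and the helper names `lay_out`, `reorder_randomly`, and `run` are assumptions for illustration only.

```python
import random

def lay_out(blocks, order):
    """Place blocks in the given label order; return (code, entry_address).

    blocks: list of (label, instructions) in original order. The last block
    is assumed to end in ("halt",) or ("jmp", label).
    """
    # Step 1: make every fall-through explicit so blocks can move freely.
    bodies = {}
    for i, (label, body) in enumerate(blocks):
        body = list(body)
        if body[-1][0] not in ("jmp", "halt"):
            body.append(("jmp", blocks[i + 1][0]))
        bodies[label] = body
    # Step 2: assign each block its address in the new layout.
    address, position = {}, 0
    for label in order:
        address[label] = position
        position += len(bodies[label])
    # Step 3: emit the code, fixing up symbolic targets to new addresses.
    code = []
    for label in order:
        for ins in bodies[label]:
            if ins[0] == "jmp":
                code.append(("jmp", address[ins[1]]))
            elif ins[0] == "jnz":
                code.append(("jnz", ins[1], address[ins[2]]))
            else:
                code.append(ins)
    return code, address[blocks[0][0]]

def reorder_randomly(blocks, rng):
    """Shuffle the block order and lay the program out in that order."""
    order = [label for label, _ in blocks]
    rng.shuffle(order)
    return lay_out(blocks, order)

def run(code, entry, regs):
    """Interpret the laid-out code starting at the entry address."""
    regs, pc = dict(regs), entry
    while code[pc][0] != "halt":
        ins = code[pc]
        if ins[0] == "add":
            regs[ins[1]] += ins[2]
        elif ins[0] == "sub":
            regs[ins[1]] -= ins[2]
        elif ins[0] == "jmp":
            pc = ins[1]
            continue
        elif ins[0] == "jnz" and regs[ins[1]] != 0:
            pc = ins[2]
            continue
        pc += 1
    return regs

# Hypothetical program: "start" falls through into "loop", then into "done".
BLOCKS = [
    ("start", [("add", "a", 1)]),
    ("loop", [("add", "a", 10), ("sub", "n", 1), ("jnz", "n", "loop")]),
    ("done", [("halt",)]),
]
```

Because the entry address is returned alongside the code, the program still starts at the original first block even when that block ends up last in memory. The execution path is unchanged for every ordering, while the in-memory addresses of the blocks differ.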
The obfuscation technique of reordering import tables again seeks to change the ordering in memory of code blocks. The import table of an application program identifies code segments (e.g., dynamic link libraries) that are to be loaded into memory when the application program is executed. The loader typically loads the code segments into memory in the same order as they are identified in the import table. The obfuscation system may randomly reorder an import table so that a loader will load the code segments in a different order and thus at different locations. If code of the application program relies on a code segment being at a certain location, then the obfuscation system needs to fix up the transfer instructions.
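The effect of import table order on load addresses can be sketched with a toy loader that places code segments back to back. The module names, sizes, base address, and the `(module, offset)` call-site format are hypothetical; real loaders also honor preferred base addresses and other constraints that this sketch ignores.

```python
import random

def load_modules(import_table, base=0x10000000):
    """Toy loader: place modules back to back in import-table order."""
    addresses, next_free = {}, base
    for name, size in import_table:
        addresses[name] = next_free
        next_free += size
    return addresses

def shuffled_table(import_table, rng):
    """Return a randomly reordered copy of the import table."""
    table = list(import_table)
    rng.shuffle(table)
    return table

def resolve(call_sites, addresses):
    """Fix up (module, offset) references to absolute addresses."""
    return [addresses[module] + offset for module, offset in call_sites]

# Hypothetical import table: (module name, size in bytes).
IMPORT_TABLE = [("net.dll", 0x3000), ("crypto.dll", 0x2000), ("ui.dll", 0x5000)]
CALL_SITES = [("crypto.dll", 0x40)]
```

With the original order, `crypto.dll` loads at `0x10003000`; with the table reversed it loads at `0x10005000`, so a hard-coded address would miss, while the fixed-up call site follows the module to its new location.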
The obfuscation technique of varying stack frames seeks to modify the locations of certain data on the stack such as return addresses. The obfuscation system can vary stack frames in different ways to make it difficult for malware to rely on a consistent ordering and location of stack data. For example, the obfuscation system may add instructions that push padding bytes onto the stack and remove them again at various times during execution of the program. As another example, the obfuscation system may add additional call frames to the stack by, for example, adding a wrapper to a function call. Thus, when the function is invoked by the wrapper, the additional frame of the wrapper will cause the offsets to data on the stack to be changed. Malware that tries to access the data on the stack assuming a known offset (e.g., the Slammer virus) will likely fail when the stack frame is varied. The obfuscation system may wrap a function a randomly selected number of times to vary the offsets from installation to installation or from execution to execution.
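The following sketch models a stack frame as a list of slots to show why a payload built for one fixed offset misses once padding is added. The slot model, the constant names, and the helper functions are assumptions; real stacks hold bytes, grow in a machine-specific direction, and may be overrun by longer payloads than the one modeled here.

```python
import random

RETURN_ADDRESS = "return-to-caller"  # legitimate return address
EVIL_ADDRESS = "jump-to-payload"     # value the attacker wants in that slot

def build_frame(buffer_len, padding):
    """Model a frame as slots: buffer, then padding, then return address."""
    return ["buf"] * buffer_len + ["pad"] * padding + [RETURN_ADDRESS]

def unchecked_copy(frame, payload):
    """Copy the payload into the frame from slot 0 with no bounds check."""
    frame = list(frame)
    for i, value in enumerate(payload[:len(frame)]):
        frame[i] = value
    return frame

def attack_succeeds(buffer_len, padding):
    """Return True if the return-address slot ends up attacker-controlled."""
    # The payload is built for the unpadded layout: filler plus one address.
    payload = ["A"] * buffer_len + [EVIL_ADDRESS]
    frame = unchecked_copy(build_frame(buffer_len, padding), payload)
    return frame[-1] == EVIL_ADDRESS

def random_padding(rng, max_slots=8):
    """Choose how many padding slots to add for this installation or run."""
    return rng.randint(1, max_slots)
```

In this model `attack_succeeds(4, 0)` is true because the address lands exactly on the return slot, while any nonzero padding leaves the return address intact and the attacker's address stranded in a padding slot.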
The obfuscation technique of renaming binaries seeks to make it difficult for malware to locate the executable files on a storage medium that it seeks to infect. The obfuscation system may generate random names for the executable files or randomly swap the names of the executable files. To prevent malware from identifying a desired executable file by simply searching for code sequences within the executable files, rather than by name, the obfuscation system may encrypt the executable code. The obfuscation system can encrypt executable code that is stored on disk at installation and decrypt the executable code when it is to be executed. This encryption will help prevent malware from infecting the executable code that is stored on disk. Alternatively or in addition, the obfuscation system can store encrypted executable code in memory and decrypt portions of the executable code on an as-needed basis during execution. The encryption can use a complex encryption algorithm or a simple one (e.g., XOR'ing the code). The obfuscation system may add calls to a function to decrypt portions of the code before execution and calls to a function to encrypt portions of the code after execution.
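A simple XOR scheme of the kind mentioned above, together with random file naming, might look like the following. The function names and the key and name lengths are assumptions; XOR with a repeating key is shown only because the text names it as the simple option, and it offers little protection against a determined analyst.

```python
import secrets

def xor_bytes(data, key):
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def new_key(length=16):
    """Generate a random per-installation key (length is an assumption)."""
    return secrets.token_bytes(length)

def random_file_name(extension=".bin"):
    """Generate a random 16-hex-digit name for an executable file."""
    return secrets.token_hex(8) + extension
```

For instance, encrypting the byte sequence `55 8b ec 90 c3` with the one-byte key `5a` yields `0f d1 b6 ca 99`, so a scan for the original opening bytes no longer matches, and applying `xor_bytes` again with the same key recovers the original code for execution.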
In one embodiment, the obfuscation system may replace or wrap functions that are commonly invoked by malware with a function, referred to as a “minefield” function, that detects and reports the presence of the malware. The obfuscation system may copy the function to a new location and replace it with a minefield function that detects whether the invoking code is malware. If the invoking code is malware, then the minefield function may secretly report that the computer system has been infected or take some other anti-malware action. If the invoking code is not malware, then the minefield function may invoke the moved function to effect the normal behavior of the function.
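The wrapping pattern can be sketched as a higher-order function. The text does not say how a minefield function decides that its caller is malware, so the `looks_like_malware` predicate, the `report` callback, and the explicit `caller` argument are all hypothetical placeholders supplied by whoever builds the wrapper.

```python
def make_minefield(moved_function, looks_like_malware, report):
    """Build a minefield that guards access to the relocated function."""

    def minefield(caller, *args, **kwargs):
        if looks_like_malware(caller):
            # Report the suspected infection instead of doing the real work.
            report(caller)
            return None
        # Legitimate caller: forward to the function at its new location.
        return moved_function(*args, **kwargs)

    return minefield
```

As a usage example, `make_minefield(real_open, lambda c: c not in trusted, alerts.append)` would forward calls from names in a `trusted` set and append any other caller to an `alerts` list; `real_open`, `trusted`, and `alerts` are illustrative names only.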
The computing device on which the obfuscation system is implemented may include a central processing unit, memory, input devices (e.g., keyboard and pointing devices), output devices (e.g., display devices), and storage devices (e.g., disk drives). The memory and storage devices are computer-readable media that may contain instructions that implement the obfuscation system. In addition, the data structures and message structures may be stored or transmitted via a data transmission medium, such as a signal on a communications link. Various communications links may be used, such as the Internet, a local area network, a wide area network, or a point-to-point dial-up connection.
The obfuscation system may be implemented in various operating environments that include personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The obfuscation system may be implemented on computing devices that include personal digital assistants (“PDAs”), cell phones, consumer electronic devices (e.g., audio playback devices), game devices, and so on.
The obfuscation system may be described in the general context of computer-executable instructions, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, and so on that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. For example, the obfuscation system may be built into a program (e.g., application program) so that each time the program is to be loaded for execution the program automatically obfuscates itself in different ways (e.g., randomly selected obfuscation). The obfuscation system may also be implemented as a server that downloads obfuscated code to clients for execution. For example, an organization may have a server that maintains a copy of programs that are downloaded to its clients each time a user requests to execute a program. The server can obfuscate the executable code before download. This removes the obfuscation overhead from the client, which may not have significant computational power (e.g., a cell phone). In addition, since a program is not stored in the client persistently, the chances of being infected by malware are further reduced. Accordingly, the invention is not limited except as by the appended claims.
From the foregoing, it will be appreciated that specific embodiments of the invention have been described herein for purposes of illustration, but that various modifications may be made without deviating from the spirit and scope of the invention. Accordingly, the invention is not limited except as by the appended claims.
|U.S. Classification||726/22, 713/189|
|International Classification||G06F11/30, G06F11/00|
|Cooperative Classification||G06F21/52, G06F21/566|
|European Classification||G06F21/56C, G06F21/52|
|Date||Code||Event||Description|
|Nov 18, 2005||AS||Assignment||Owner name: MICROSOFT CORPORATION, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SHELEST, ART;HARTRELL, GREGORY D.;REEL/FRAME:016798/0014. Effective date: 20050930|
|Oct 26, 2010||CC||Certificate of correction|||
|Mar 18, 2013||FPAY||Fee payment||Year of fee payment: 4|
|Dec 9, 2014||AS||Assignment||Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034543/0001. Effective date: 20141014|