US RE41548 E1
An Application-Specific Field Programmable Gate Array (FPGA) device or fabric is described for use in applications requiring fast reconfigurability of devices in the field, enabling multiple personalities for re-using silicon resources (like arrays of large multipliers in DSP applications) from moment-to-moment for implementing different hardware algorithms. In a general purpose FPGA device or fabric, this fast reconfigurability is normally implemented by special reconfiguration support circuitry and/or additional configuration memory. Unfortunately, this flexibility requires a large amount of programmable routing resource and silicon area—limiting the viability in volume production applications. This invention describes how multi-program FPGA functionalities may be migrated to smaller die by constructing hybrid FPGA/ASIC implementations that retain the multi-program capability. Also described is a multi-program FPGA fabric architecture that uses a hybrid FPGA/ASIC interconnect structure, resulting in a much smaller silicon area when customized for a particular user application.
1. A method for constructing an integrated circuit to implement custom logic functionality, comprising:
constructing diffusion and metal layers to construct a generic array of logic cells and field programmable switches, including a plurality of unconnected field programmable switches and unconnected logic cells; and
choosing a netlist corresponding to a particular end-user application to be implemented; and
completing the fabrication of said integrated circuit for the specific initial overall functionality of said particular end-user application by implementing in additional metal layers, a final connection layout with custom hard-wired metal connections between some of said unconnected logic cells and also incorporating custom hard-wired metal connections between some of said unconnected logic cells and said unconnected field programmable switches.
2. The method of
3. The method of
4. An integrated circuit, comprising:
a plurality of field programmable switches and a plurality of logic cells, wherein the field programmable switches and the logic cells are formed in diffusion and metal layers, wherein the number of field programmable switches is less than a number of field programmable switches in an initial design of the integrated circuit, and wherein the difference is the number of field programmable switches omitted from the initial design when forming the integrated circuit;
a first subset of hardwired connections making a first plurality of connections in additional metal layers between ones of a first subset of the plurality of logic cells, wherein the first subset of hardwired connections are introduced to replace at least a first subset of the omitted field programming switches, modifying the initial design; and
a second subset of hardwired connections making a second plurality of connections in the additional metal layers between a second subset of the plurality of logic cells and the plurality of field programmable switches, wherein the second subset of hardwired connections are introduced to replace at least a second subset of the omitted field programming switches, modifying the initial design.
5. The integrated circuit of
6. The integrated circuit of
7. The integrated circuit of
8. A hybrid application-specific and field programmable integrated circuit, comprising:
a plurality of logic elements formed within metal and diffusion layers;
a plurality of hard-wired connections formed in additional metal layers coupling the plurality of logic elements in a pattern according to a netlist to implement a particular end-user application; and
a plurality of field programmable switches, wherein each field programmable switch is formed within the metal and diffusion layers and is coupled to two or more of the logic elements;
wherein the number of field programmable switches is less than a number of field programmable switches in an initial design of the integrated circuit, and wherein the difference is the number of field programmable switches omitted from the initial design when forming the integrated circuit; and
wherein the hardwired connections are introduced to replace at least a subset of the omitted field programming switches, modifying the initial design.
9. The hybrid application-specific and field programmable integrated circuit of
10. The hybrid application-specific and field programmable integrated circuit of
11. The hybrid application-specific and field programmable integrated circuit of
This application claims the benefit of U.S. Provisional Application Ser. No. 60/403,777, filed on Aug. 13, 2002, and entitled “Application specific multi-program FPGA,” commonly assigned with the present invention and incorporated herein by reference.
This application is also a continuation-in-part of prior U.S. Nonprovisional application Ser. No. 10/621,957, filed Jul. 16, 2003, entitled “Reprogrammable Instruction DSP,” which itself claims priority to U.S. Provisional Application Ser. No. 60/396,375, filed on Jul. 17, 2002, entitled “Reprogrammable instruction DSP with multi-program FPGA fabric.” Application No. 60/396,375 is incorporated herein by reference.
Also, the present invention is related to that disclosed in Disclosure Document Ser. No. 522895, filed on Dec. 10, 2002, entitled “Hybrid FPGA” and incorporated herein by reference. It is requested that this document be retained for future reference.
This invention relates to the fields of Programmable Logic Devices (PLDs) including Field Programmable Gate Arrays (FPGAs) as well as custom and semi-custom logic devices, and in particular capabilities for reconfiguration of devices while maintaining acceptable silicon densities, cost, and performance.
Part of the historical vision of programmable hardware (typically based on some form of FPGA technology) is that the reprogrammable fabric can remain programmable in production. One reason for this vision is that it allows adaptability to future (unforeseen) changes in functional requirements. This can also be extended to enabling changes “on-the-fly”. Fast, on-the-fly reconfiguration can also enable the re-use of FPGA logic functionality for different purposes, from moment to moment and during normal execution, thereby increasing the effective silicon density of the FPGA. Changes on-the-fly allow the personality of the logic to be altered from moment to moment as different algorithms are required for different tasks, sometimes altering the personality in as little as a clock or two. The faster the FPGA can be reconfigured, the more its resources can be utilized for more than one user function, and the more the effective density is increased. This fast reconfiguration for increasing effective silicon density is especially useful in DSP (Digital Signal Processor) applications, where many large multiplier functions are typically required and, if connected differently from moment to moment, can be re-used to implement different algorithms as required.
Conventional FPGA devices like those manufactured by Xilinx and Altera have been enhanced to allow somewhat faster reconfiguration and/or also partial reconfiguration. Also, some FPGA fabrics (the basic logic array structure) for use as IP (Intellectual Property) Cores in System On Chip (SOC) designs have been designed with provision for very fast full and/or partial configuration. However, these enhancements usually do not allow for major functionality changes within a clock or two. Even so, FPGA fabric providers like Elixent and Adaptive Silicon see their fast reconfiguration capability as valuable for re-using FPGA logic for different algorithms in the same application. Also, companies like PACT and GateChange see their fast partial-reconfiguration capability as useful for changing functions in real-time in a pipelined manner, so that the FPGA function can be altered as data propagates through the device. Chameleon offers a device that contains a full shadow memory for fast reconfiguration in a clock or two.
Unfortunately, the FPGA fabrics typically used in these solutions consume between 20 and 40 times as much silicon area as a standard-cell ASIC implementation normally used in high-volume SOC design. Very fast reconfiguration capability that is implemented without adding large amounts of additional memory requires an FPGA fabric architecture that has additional silicon area allocated to fast reconfiguration bus structures and sometimes additional memory to cache some of the reconfiguration data. Further, if it is desirable to alter the function of the FPGA fabric on-the-fly and within a clock cycle or two, additional configuration memory must be included in the FPGA fabric to implement the “multi-program” capability, increasing the consumption of silicon area even more.
Today, it remains to be seen if the value of full reprogrammability is economically viable for high-volume designs. The same is true for fast-reconfiguration FPGAs for multi-program implementations where full reprogrammability is retained for each personality—regardless of whether additional configuration memory is included or not. The silicon area penalties of retaining the capability for FPGA fabrics to implement any arbitrary functions are too great for most applications with any significant production volume.
There may come a time where fully-programmable multi-program (fast reconfiguration and/or multi-program memory) FPGA fabrics may become viable for SOC and FPGA volume production. However, in the meantime, there is a need for solutions that take advantage of the flexibility benefits of FPGA technology, while also providing an effective and practical solution for volume production. The promise of fast reconfiguration in FPGA fabrics for the purpose of re-using silicon resources (like arrays of large multipliers in DSP applications) may be fulfilled with acceptable device cost if the fabric can be tailored to the application.
Also, given the realities for very deep submicron design and the opinion of some experts that Moore's law (for semiconductor density and performance over time) is breaking down, it would be especially valuable if a device architecture were available that can implement multi-program functionality for a particular customer application, with acceptable silicon area for volume production, while requiring a limited number of custom masks for personalization.
An Application Specific Field Programmable Gate Array (FPGA) device or fabric is described that is intended for use in applications requiring very fast reconfigurability of devices in the field, such that this FPGA fabric can effectively exhibit multiple personalities from time-to-time during normal use. These multiple personalities are especially valuable in re-using silicon resources (like arrays of large multipliers in DSP applications) from moment-to-moment for implementing different hardware algorithms.
In a general purpose FPGA device or fabric, this fast reconfigurability can be implemented by special reconfiguration support circuitry and/or additional configuration memory. Unfortunately, maintaining the capability for the FPGA to implement any arbitrary function for each personality requires a large amount of programmable routing resource and silicon area—limiting the viability in volume production applications.
This invention describes how multi-program FPGA functionalities may be migrated to smaller die by constructing hybrid FPGA/ASIC implementations that retain the multi-program capability. Also described is a multi-program FPGA fabric architecture that uses a hybrid FPGA/ASIC interconnect structure, resulting in a much smaller silicon area when customized for a particular user application.
The present invention is described with respect to particular exemplary embodiments thereof and reference is accordingly made to the drawings in which:
The general topic of reprogrammable hardware goes beyond simply using FPGA-style re-programmable logic to implement a function. In some implementations that have been described, it may be desirable to alter the device's functionality during operation—effectively creating a multi-program FPGA or FPGA fabric. Such dynamic alteration allows the hardware resources of the device to be used for a first functionality at one point in time, and a second functionality a few moments later, thus increasing the overall effective functionality without a proportionate increase in the device size. To successfully implement this re-use of hardware functionality, a very fast re-configuration time is generally required. This fast reconfiguration time may be implemented using special reconfiguration bus structures and support circuitry and/or additional configuration memory.
If an adequately fast reconfiguration time can be achieved utilizing special reconfiguration support circuitry, this is preferable to adding large amounts of additional configuration RAM. Special reconfiguration support circuitry typically includes an optimized reconfiguration bus structure and control circuit and sometimes some amount of cache RAM for burst transfer of configuration data.
When multiple configuration RAM images must be stored in a multi-program FPGA, this may be implemented with an SRAM structure where a primary RAM cell controls the current configuration of a connection point or logic function and a “shadow” RAM cell or cells can be loaded with alternate pattern(s) to be transferred into the primary configuration RAM very quickly, sometimes within a clock period or two. Alternatively, the configuration RAM may simply have multiple locations per controllable connection point or logic function control bit.
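The two storage schemes just described can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class names `ShadowedConfigBit` and `MultiLocationConfigBit`, their methods, and the slot and program indices are hypothetical and do not come from the specification, and the model says nothing about circuit-level timing.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ShadowedConfigBit:
    """Hypothetical model: one primary cell plus one or more shadow cells."""
    primary: bool = False
    shadow: List[bool] = field(default_factory=lambda: [False])

    def load_shadow(self, slot: int, value: bool) -> None:
        # Background load; the active (primary) configuration is untouched.
        self.shadow[slot] = value

    def activate(self, slot: int) -> None:
        # Fast transfer of the stored alternate pattern into the primary cell.
        self.primary = self.shadow[slot]


@dataclass
class MultiLocationConfigBit:
    """Hypothetical model: several stored values, chosen by a program select."""
    values: List[bool]

    def read(self, program: int) -> bool:
        return self.values[program]


bit = ShadowedConfigBit(primary=True)
bit.load_shadow(0, False)
before = bit.primary   # primary is unchanged while the shadow is loaded
bit.activate(0)
after = bit.primary    # primary now holds the shadow value

multi = MultiLocationConfigBit(values=[True, False])
```

In the first model a personality change is a copy from shadow to primary; in the second it is only a change of the program index used to read the bit.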
If the implementation for
Note that this multi-program FPGA fabric, which will be described in more detail, can exist independently of any specific fixed functions 105, and could actually be embodied in an FPGA device containing only the multi-program FPGA fabric, or in an FPGA device also containing fixed functions such as memory and processor, or alternately may be embedded in a SOC ASIC as an independent IP core, or combined in some way with a conventional software programmable DSP processor.
In the initial, fully programmable implementation, FPGA logic Cells 201 and 202 as well as FPGA Routing 203 contain the full flexibility of the FPGA technology utilized. Note that large grain cells 202 could represent or be used to implement multipliers. Configuration memory cells 204 contain the full complement of cells required to support all possible functionalities of the FPGA in two or more program configurations. Although this specification often utilizes a paradigm of multiple memory cells per FPGA programming/connection point, it should be understood that multiple FPGA programs may instead be implemented by employing special bus and circuit structures to enable fast reconfiguration where single memory cells control FPGA connection points and other programmable functions.
To achieve a lower device cost for higher volume production and/or to achieve lower power consumption, a specific multi-program user design may be migrated according to this invention to a Hybrid FPGA/ASIC implementation, where the required configuration memory cells are retained in order to implement each specific application (hence “application specific”), including the multi-program (multi-personality) capability. Any connection points or logic functions that need not be programmable are deleted or hard-wired as appropriate, and their configuration memory control cells are also deleted. Thus, FPGA Cells 207 and 208 may be reduced in size or otherwise simplified, and FPGA routing 209 will now comprise a combination of FPGA programmable routing connection points and hard-wired ASIC connections, a unique form of hybrid FPGA/ASIC. FPGA configuration memory cells 210 will now consist of a much smaller number of cells, since only those actually required to implement the specific multi-program applications will remain. All other (unnecessary) cells have been deleted in the process of performing this migration. Note that any fixed functions 205 and 211 are usually identical in both the fully programmable multi-program FPGA fabric and the higher density, application-specific hybrid FPGA/ASIC fabric. These fixed functions may include, for example, software programmable (e.g. RISC or DSP) processors, memory (RAM and/or ROM), I/Os, PLLs, etc.
Note that this multi-program FPGA fabric and the migration method shown for increasing functional density can be employed in conjunction with any other functions fixed or programmable, or alternately can be implemented as a standalone FPGA device or an FPGA fabric embedded in an SOC (System on Chip) design—in all cases using the method described herein to migrate the multi-program FPGA fabric to a lower cost, higher density, higher performance implementation.
To further demonstrate the method of this invention for migrating a specific multi-program FPGA design to a hybrid FPGA/ASIC implementation, it is first useful to define an example FPGA interconnect matrix like the one shown in FIG. 3. FPGA interconnect matrices can be constructed in a variety of styles including a variety of connection point de-population schemes.
The multi-program RAM of
Although the method described here focuses on the FPGA interconnect (which normally dominates silicon usage in a typical reprogrammable FPGA by a factor of approximately three to one over the logic cells), a similar scheme may be implemented within the FPGA logic cells, if those cells contain reprogrammable functionality. Some FPGA logic cells, like the well-known look-up table (LUT), are highly programmable and could be implemented as multi-programmable by substituting multiple-location RAM blocks where a single RAM cell is normally used for configuration. In other implementations, some or all FPGA logic cells may have a fixed functionality such as a multiplexer or a multiplier. Other implementations may have a mixture of some logic cells that are programmable and some that are not. FPGA logic cells may also be quickly reprogrammed in some implementations by fast reconfiguration busses and structures.
To further describe how a multi-program FPGA implementation can be migrated to a hybrid FPGA/ASIC implementation according to this invention, it is appropriate to describe a simple example application with two different programs or personalities, such as that shown in FIG. 4.
If it is anticipated that a multi-program FPGA design might possibly be migrated to a hybrid FPGA/ASIC implementation, it is advantageous to first achieve a routing pattern for program A that has as many selected connection points as possible in common with those required for program B. This way, when the routing patterns for program A and program B are consolidated, the most efficient merging of resources will result. Although this is desirable, it is not required, and routing configurations for programs A and B with fewer connection points in common may be preferable if they result in higher performance.
The consolidation process requires identifying programmable connection points 406 that are common to both programs, that is, points where a connection must always be made regardless of the program implemented. These can later be eliminated and turned into hard-wired connections. Also, programmable connection points 404 that are utilized only by program A, and programmable connection points 405 that are utilized only by program B, are identified. These must be retained as programmable connections in the consolidated implementation. All other programmable connection points 407, which are not utilized by either program, are identified and eliminated in the consolidated implementation that represents the hybrid FPGA/ASIC.
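The two-program consolidation described above amounts to simple set operations on the connection points each program selects. The following is a minimal sketch, assuming each connection point is identified by a hypothetical (row, column) pair; the function name `consolidate`, the dictionary keys, and the example data are illustrative only.

```python
from typing import Dict, Set, Tuple

Point = Tuple[int, int]  # hypothetical (row, column) of a connection point


def consolidate(all_points: Set[Point], used_a: Set[Point],
                used_b: Set[Point]) -> Dict[str, Set[Point]]:
    """Split the connection points into the four categories described above."""
    return {
        "hard_wired": used_a & used_b,                  # used by both (cf. 406)
        "programmable_a": used_a - used_b,              # program A only (cf. 404)
        "programmable_b": used_b - used_a,              # program B only (cf. 405)
        "eliminated": all_points - (used_a | used_b),   # used by neither (cf. 407)
    }


# Made-up 4x4 matrix and two made-up routing patterns.
matrix = {(r, c) for r in range(4) for c in range(4)}
prog_a = {(0, 0), (1, 2), (3, 3)}
prog_b = {(0, 0), (2, 1), (3, 3)}
result = consolidate(matrix, prog_a, prog_b)
```

In this made-up example, two points become hard-wired, two remain programmable, and the other twelve are removed.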
The first step, 601, is optional and specifies that the FPGA routing patterns for programs A and B in the initial implementation should have as many programmable connection points as possible in common. This can be accomplished in a variety of ways when the routing software is executed for programs A and B. One method would be to create a routing pattern for program A and then use this routing pattern as a starting point to create the pattern for program B. Some variation on a “rip-up and re-try” algorithm may be utilized here. Then, one could create a routing pattern for program B and use this routing pattern as a starting point to create the pattern for program A. The results of the two exercises can then be compared, with the pair exhibiting the greatest number of common programmable connection points being kept as the preferred patterns. Of course, the performance and capacity requirements for the initial reprogrammable implementation must also be taken into account here.
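The "try both orderings and keep the better one" idea in step 601 can be sketched with a deliberately simplified stand-in for the routing software. Everything here is an assumption made for illustration: the toy `route` function merely picks, for each net, the candidate route that shares the most connection points with a seed pattern, whereas a real router (for example one using rip-up and re-try) would also honor timing and capacity constraints.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Point = Tuple[int, int]
# Hypothetical input: each net offers alternative routes (sets of points).
Candidates = Dict[str, List[FrozenSet[Point]]]


def route(candidates: Candidates, seed: Set[Point]) -> Set[Point]:
    """Toy router: per net, choose the alternative that overlaps the seed most.
    Ties go to the first alternative listed."""
    chosen: Set[Point] = set()
    for alternatives in candidates.values():
        best = max(alternatives, key=lambda r: len(r & seed))
        chosen |= best
    return chosen


def best_ordering(prog_a: Candidates, prog_b: Candidates):
    # Trial 1: route A freely, then route B using A's pattern as the seed.
    a1 = route(prog_a, set())
    b1 = route(prog_b, a1)
    # Trial 2: route B freely, then route A using B's pattern as the seed.
    b2 = route(prog_b, set())
    a2 = route(prog_a, b2)
    trials = [("A first", a1, b1), ("B first", a2, b2)]
    # Keep the trial with the most connection points in common.
    return max(trials, key=lambda t: len(t[1] & t[2]))


prog_a = {"n1": [frozenset({(0, 0), (0, 1)}), frozenset({(1, 0), (1, 1)})]}
prog_b = {"m1": [frozenset({(1, 0), (1, 1)}), frozenset({(0, 0), (2, 2)})]}
label, pattern_a, pattern_b = best_ordering(prog_a, prog_b)
```

With this made-up data, routing B first yields two shared connection points while routing A first yields only one, so the "B first" pair is kept.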
In order to consolidate multiple programs such that they may be retained in the hybrid FPGA/ASIC, the next step 602 is to identify connection points that are common to all programs and define these as non-programmable, solid connections in the hybrid FPGA/ASIC netlist. Then, in step 603, connection points that are used for some, but not all, programs are identified to be retained as programmable in the hybrid FPGA/ASIC netlist. Finally, in step 604, connection points are identified that are not used for any defined program, and these are eliminated from the hybrid FPGA/ASIC netlist. Typically this last category will comprise the majority of the connection points in the initial FPGA structure and will therefore account for the largest amount of silicon area reduction after the specific design has been migrated to the hybrid FPGA/ASIC implementation. The physical layout for an FPGA connection matrix is normally very regular. In the hybrid FPGA/ASIC implementation just described, the remaining programmable connection points, after de-population, may still be laid out in a regular array if a custom or semi-custom layout is to be generated for the hybrid FPGA/ASIC implementation. Although the irregular wiring patterns required to connect the remaining programmable connection points may result in some loss of silicon area efficiency, the relatively large number of deleted, unused connection points will make the hybrid FPGA/ASIC a significantly smaller die nonetheless.
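Steps 602 through 604 extend naturally to any number of programs by counting how many programs use each connection point. The sketch below is illustrative only: the function name `classify`, the (row, column) point representation, and the 8x8 example matrix are assumptions, and the point counts are a rough proxy that ignores the wiring irregularity mentioned above.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Point = Tuple[int, int]  # hypothetical (row, column) of a connection point


def classify(all_points: Set[Point],
             programs: List[Set[Point]]) -> Dict[str, Set[Point]]:
    """Classify connection points by how many programs use them."""
    if not programs:
        raise ValueError("at least one program is required")
    n = len(programs)
    usage = Counter(p for prog in programs for p in prog)
    return {
        # Step 602: used by every program, so becomes a solid connection.
        "hard_wired": {p for p in all_points if usage[p] == n},
        # Step 603: used by some but not all programs, so stays programmable.
        "programmable": {p for p in all_points if 0 < usage[p] < n},
        # Step 604: used by no program, so is removed from the netlist.
        "eliminated": {p for p in all_points if usage[p] == 0},
    }


# Made-up 8x8 matrix (64 points) and three made-up programs.
matrix = {(r, c) for r in range(8) for c in range(8)}
programs = [
    {(0, 0), (1, 1), (2, 2)},
    {(0, 0), (1, 1), (3, 3)},
    {(0, 0), (4, 4), (2, 2)},
]
groups = classify(matrix, programs)
retained_fraction = len(groups["programmable"]) / len(matrix)
```

Here only 4 of the 64 original connection points still need configuration memory, which mirrors the observation that the unused category usually dominates.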
In addition to thinking of the Hybrid-FPGA solely as a way of merging multiple specific FPGA designs in order to programmably implement any of them, an additional step can be performed to add additional programmable routing switches such that variations in the design that are not yet known may be made after fabrication. This is especially useful for the semi-custom approach to be described in
Notice that in a Hybrid FPGA, it is not only switches that may be left uncommitted. Logic modules (cells) may also be left uncommitted. This can allow additional, currently unknown functionality to be added at a later date. In effect with this architecture, a specific logic functionality may be initially implemented in ASIC fashion using some portion of the modules and hard-wired metal connections, but if changes are needed after construction, they may be implemented in the field by using some of the uncommitted logic modules and field programmable switches. This can allow some degree of ASIC “Bug fixes” to be performed in the field.
An alternative to performing an all-layer custom implementation for the aforementioned hybrid FPGA/ASIC is shown in
Notice that programmable interconnect fabric 808 in
Therefore, methods and apparatus for implementing an application-specific multi-program FPGA fabric or device have been described.
It should be understood that the particular embodiments described above are only illustrative of the principles of the present invention, and various modifications could be made by those skilled in the art without departing from the scope and spirit of the invention. Thus, the scope of the present invention is limited only by the claims that follow.