Publication number: US20030135794 A1
Publication type: Application
Application number: US 10/389,666
Publication date: Jul 17, 2003
Filing date: Mar 14, 2003
Priority date: Jun 18, 1999
Also published as: US6560725
Publication number: 10389666, 389666, US 2003/0135794 A1, US 2003/135794 A1, US 20030135794 A1, US 20030135794A1, US 2003135794 A1, US 2003135794A1, US-A1-20030135794, US-A1-2003135794, US2003/0135794A1, US2003/135794A1, US20030135794 A1, US20030135794A1, US2003135794 A1, US2003135794A1
Inventors: Michael Longwell, William Atwell, Jeffrey Myers
Original Assignee: Longwell Michael L., Atwell William Daune, Myers Jeffrey Van
Method for apparatus for tracking errors in a memory system
US 20030135794 A1
Abstract
A method for tracking errors in a memory system by detecting an error in a bit of a word accessed in the memory and maintaining an error history comprising a record of each of said detected errors. The error history information may be used to configure the memory, such as to add redundancy; or may be used to adjust operating parameters of the memory, such as the periodicity of refresh and/or scrub operations; or may be used to trigger a sensing operation of other parameters in an application system. In one embodiment, a counter increments each time an error is detected and decrements when no error is detected, thereby tracking error patterns.
Claims(25)
What we claim is:
1. In a memory system having a memory for storing a plurality of words, each comprising a plurality of bits, an error detection and tracking circuit comprising:
an error detection circuit connected to the memory to detect an error in a bit of a word accessed in the memory; and
a tracking unit connected to the error detection circuit to maintain an error history comprising a record of each of said detected errors.
2. The memory system of claim 1 wherein the error detection circuit corrects detected single-bit errors.
3. The memory system of claim 2 wherein the error detection circuit detects multi-bit errors.
4. The memory system of claim 1 wherein the error history includes a record of each of said detected errors over a predetermined tracking window.
5. The memory system of claim 4 wherein the tracking unit asserts an error signal when the number of errors recorded in the tracking window exceeds a predetermined threshold.
6. The memory system of claim 1 wherein said record includes the location in the memory of the word in which said error bit was detected.
7. The memory system of claim 6 wherein said record indicates the nature of said error.
8. The memory system of claim 1 wherein the tracking unit also maintains in said error history a record of each word accessed in said memory in which no error is detected.
9. The memory system of claim 8 wherein the tracking unit periodically discards the oldest of said records in said error history.
10. The memory system of claim 1 wherein the tracking unit selectively forwards said records in said error history.
11. The memory system of claim 1 further comprising:
a refresh controller connected to the memory to periodically refresh each word stored in the memory; and
wherein the tracking unit controls the refresh periodicity as a function of the error history.
12. The memory system of claim 1 further comprising:
a scrub controller connected to the memory to periodically scrub each word stored in the memory; and
wherein the tracking unit controls the scrub periodicity as a function of the error history.
13. The memory system of claim 1 wherein the memory includes a selectively enabled redundant bit; and wherein the tracking unit selectively enables said redundant bit as a function of the error history.
14. The memory system of claim 1 wherein the memory includes a selectively enabled sensor; and wherein the tracking unit selectively enables said sensor as a function of the error history.
15. The memory system of claim 1 further comprising:
a memory controller connected to the memory, the error detection circuit and the tracking unit, to control an access to the memory during an access cycle, at least one parameter of which is controllable; and
wherein the tracking unit controls said parameter as a function of the error history.
16. The memory system of claim 1 wherein the tracking unit selectively disables a selected portion of the memory as a function of the error history.
17. An error detection and tracking circuit for use in a memory system having a memory for storing a plurality of words, each comprising a plurality of bits, the circuit comprising:
an error detection circuit connected to the memory to detect an error in a bit of a word accessed in the memory; and
a tracking unit connected to the error detection circuit to maintain an error history comprising a record of each of said detected errors.
18. The circuit of claim 17 wherein the memory includes a selectively enabled redundant bit; and wherein the tracking unit selectively enables said redundant bit as a function of the error history.
19. The circuit of claim 17 wherein the memory includes a selectively enabled sensor; and wherein the tracking unit selectively enables said sensor as a function of the error history.
20. The circuit of claim 17 further comprising:
a memory controller connected to the memory, the error detection circuit and the tracking unit, to control an access to the memory during an access cycle, at least one parameter of which is controllable; and
wherein the tracking unit controls said parameter as a function of the error history.
21. The circuit of claim 17 wherein the tracking unit selectively disables a selected portion of the memory as a function of the error history.
22. A method for tracking errors in a memory system having a memory for storing a plurality of words, each comprising a plurality of bits, the method comprising the steps of:
detecting an error in a bit of a word accessed in the memory; and
maintaining an error history comprising a record of each of said detected errors.
23. The method of claim 22 further comprising the step of:
selectively enabling a redundant bit in said memory as a function of the error history.
24. The method of claim 22 further comprising the step of:
selectively enabling a sensor as a function of the error history.
25. The method of claim 22 further comprising the step of:
selectively controlling a parameter of an access cycle to the memory as a function of the error history.
Description
    FIELD OF THE INVENTION
  • [0001]
    The present invention relates to integrated circuit dynamic memories, and more specifically to methods of tracking errors in a memory system having detection and correction of errors in a memory.
  • BACKGROUND OF THE INVENTION
  • [0002]
    Systems containing digital electronic components are designed to function correctly over a variety of system parameters and conditions, such as voltage, temperature, etc. System parameters such as bias voltages are typically adjusted by open loop control methods through the use of sensors, such as temperature sensing diodes, and voltage sensors. Similarly, sensors have been used to monitor conditions which initiate a sleep-mode or idle-mode operation. These methods prevent incorrect operation of or damage to many components, particularly memory devices. Many means have been developed to correct “hard” component failures and/or “soft” noise induced loss of data.
  • [0003]
    Various methods have been developed to detect and correct errors in memory. In a Dynamic Random Access Memory (DRAM) redundant columns and rows are added to avoid the use of memory cells exhibiting poor performance. An Error Detection And Correction unit (EDAC) is used to detect errors in stored data, and if possible, correct errors in the data. EDACs greatly improve data integrity. The operation of one type of EDAC is based on a code word. Data to be stored in the memory is provided to the EDAC. The EDAC then generates check bits based on the data value. The check bits are then combined with the data to form a code word. The code word is then stored in the memory. To check the data, the EDAC reads the code word from the memory and recalculates the check bits based on the data portion of the code word. The recalculated check bits are then compared to the check bits in the code word. If there is a match, the data is correct. If there is a difference and the error is correctable, the EDAC provides the correct data and check bits as an output. If there is a difference and the error is uncorrectable, the EDAC reports the occurrence of a catastrophic failure.
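As an illustration only, the code-word scheme described in the paragraph above can be sketched in Python. The sketch assumes a Hamming(7,4) code, which the text does not specify; the function names `encode` and `check` are hypothetical. It covers the "match" and "correctable" cases. Flagging an uncorrectable multi-bit error would need an extended code (for example, one with an additional overall parity bit), which is not shown.

```python
# Illustrative sketch only: Hamming(7,4) is an assumed code, not one named in the text.

def encode(data):
    """Form a 7-bit code word [p1, p2, d1, p3, d2, d3, d4] from 4 data bits."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # check bit covering positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # check bit covering positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # check bit covering positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def check(word):
    """Recompute the check bits and compare them with the stored ones.

    Returns (data, status, position), where position is the 1-based
    location of the corrected bit, or 0 if the check bits matched.
    """
    w = list(word)
    # Each syndrome bit is a stored check bit XORed with its recomputed value.
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    position = s1 + 2 * s2 + 4 * s3
    if position:
        w[position - 1] ^= 1  # flip the single bit in error
    data = [w[2], w[4], w[5], w[6]]
    return data, ("ok" if position == 0 else "corrected"), position
```

The returned bit position is the kind of information that, according to the later description, may be handed to a tracking unit instead of being discarded.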
  • [0004]
    A variety of EDAC techniques and circuits are available, as are a variety of methods for generating code words and performing bit checks. Some methods are discussed in U.S. Pat. No. 5,598,422, by Longwell, et al., entitled “Digital computer having an error correction code (ECC) system with comparator integrated into re-encoder,” and in Error-Correction Codes, by W. W. Peterson, 2d edition, MIT Press (1972). The information from the EDAC unit is typically used as it is generated. Information such as the address of data that has been corrupted, or the location of failed bits in data are not retained after an EDAC has returned correct data.
  • [0005]
    A need exists to obtain information over time regarding the errors experienced in the memory. There is a need to retain error address and error frequency information to improve hardware reliability and/or data integrity. There is further a need to analyze error information to determine a connection between such failures and parameters of the system.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • [0006]
    The present invention may be more fully understood by a description of certain preferred embodiments in conjunction with the attached drawings in which:
  • [0007]
    FIG. 1 illustrates in block diagram form a memory system having an error detection and correction (EDAC) unit and a tracking unit according to one embodiment of the present invention; and
  • [0008]
    FIG. 2 illustrates in flow diagram form a method for tracking errors in a memory system having an EDAC unit according to one embodiment of the present invention.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • [0009]
    Throughout this description the terms “assert” and “negate” are used when referring to the rendering of a signal, signal flag, status bit, or similar apparatus into its logically true or logically false state, respectively. Similarly, with respect to information or data stored in a memory, a “zero” value is a low potential value, and a “one” is a high potential value.
  • [0010]
    The present invention extends the operating environment in which a memory can reliably store data by providing a means for tracking errors detected by an EDAC or other error detection and/or correction unit. This information may be employed advantageously to control the operating characteristics of other modules within the memory system. This allows the reuse of DRAM error information to adjust various operating parameters of the DRAM, such as access latency, the refresh cycle parameters, the scrub parameters, the voltage bias of the memory cells, and the like. Additionally, the present invention uses the DRAM error information to initiate sensing and measurement of other system parameters, such as temperature and supply voltage. Still further, the present invention provides a method of using the DRAM error information to adjust the memory configuration through means such as avoiding a defective memory location, and activating redundancy (e.g., redundant rows, columns, code words, or bits). The present invention effectively uses information from the EDAC, such as the address of data that has been corrupted, or the location of failed bits in data, for closed-loop control of system parameters to reduce the risk of catastrophic or non-recoverable system failure.
  • [0011]
    In one aspect of the present invention, in a memory system having a dynamic memory for storing a plurality of words, each comprising a plurality of bits, an error detection and tracking circuit includes an error detection circuit and a tracking unit. The error detection circuit is connected to the memory to detect an error in a bit of a word accessed in the memory. The tracking unit is connected to the error detection circuit to maintain an error history comprising a record of each of said detected errors.
  • [0012]
    In another embodiment, an error detection and tracking circuit for use in a memory system having a dynamic memory for storing a plurality of words, each comprising a plurality of bits, the circuit includes an error detection circuit connected to the memory to detect an error in a bit of a word accessed in the memory; and a tracking unit connected to the error detection circuit to maintain an error history comprising a record of each of said detected errors.
  • [0013]
    In still another aspect of the present invention, a method for tracking errors in a memory system having a dynamic memory for storing a plurality of words, each comprising a plurality of bits, the method includes the steps of detecting an error in a bit of a word accessed in the memory; and maintaining an error history comprising a record of each of said detected errors.
  • [0014]
    While the present invention is described with respect to a DRAM memory system, it is also applicable to other forms of memory, such as non-volatile memory and Static Random Access Memory (SRAM), or any other memory where reliability is desired at a higher level than that intrinsic to the technology.
  • [0015]
    Shown in FIG. 1, in block diagram form, is a memory system 10 constructed according to the preferred embodiment of our invention to reliably support a data processing system (not illustrated). The memory system 10 is generally comprised of a memory controller 12, an EDAC 14, and a DRAM 16. In operation, the memory controller 12 receives memory access requests from an external source (not shown), via a system bus 18.
  • [0016]
    For a write access, each request comprises both data and control information relating to that access, e.g., memory address, timing and sequencing control. Upon receiving a write access request, the memory controller 12 forwards the data to the EDAC 14. From that data, the EDAC 14 generates a code word for storage in the DRAM 16.
  • [0017]
    For a read access, each access request consists of the usual control information. Upon receipt of a read access request, the memory controller 12 instructs the DRAM 16 to provide the appropriate code word to the EDAC 14 for verification. If no errors are detected, or if a single-bit error is detected and corrected, the EDAC 14 provides the requested data to the memory controller 12 for forwarding to the requesting external source (not shown). If a multi-bit error is detected, however, the EDAC 14 signals the memory controller 12, and special procedures may be invoked.
  • [0018]
    Although we have illustrated the EDAC 14 as a single unit, it may be desirable in appropriate circumstances to provide multiple EDACs, such as in the distributed EDAC system we described in our co-pending U.S. Patent Application Number [Attorney Docket No. JMS009-00] entitled “Method and Apparatus for Error Detection and Correction,” by Longwell, et al., filed on Jun. 16, 1999, and assigned to the assignee hereof (the “'009 application”). To facilitate our description of the present invention, we hereby expressly incorporate herein our '009 application by reference.
  • [0019]
    In the preferred embodiment shown in FIG. 1, a refresh controller 20 is provided to periodically refresh the DRAM 16, in coordination with normal read and write accesses controlled by the memory controller 12. To facilitate early detection and correction of errors which might occur in the DRAM 16, a scrub controller 22 is also provided to cooperate with the refresh controller 20 and the memory controller 12 to periodically scrub the DRAM 16. We have described the construction and operation of such a scrub controller 22 in our co-pending U.S. Patent Application Number [Attorney Docket No. JMS008-00], entitled “Method and Apparatus for Refreshing and Scrubbing a Dynamic Memory,” by Longwell, et al., filed on May 18, 1999, and assigned to the assignee hereof (the “'008 application”). To facilitate our description of the present invention, we hereby expressly incorporate herein our '008 application by reference.
  • [0020]
    In accordance with the preferred embodiment of our invention shown in FIG. 1, the memory system 10 includes a tracking unit 24 which monitors or tracks the error information obtained during each read access and scrub cycle, and makes decisions based on this error history. As illustrated in FIG. 1, the tracking unit 24 is coupled to the EDAC 14, the DRAM 16, the refresh controller 20, and the scrub controller 22. During operation of the memory system 10, the EDAC 14 provides to the tracking unit 24 status information reflecting the quality of all code words retrieved from the DRAM 16 during either a read access or a scrub cycle. For example, if an error is detected during a particular read access, the EDAC 14 may forward to the tracking unit 24 the address in the DRAM 16 of the code word in error, as well as the nature of that error, e.g., single-bit or multi-bit. In the case of a single-bit error, the EDAC 14 may also provide the bit location in the code word at which the error occurred. On the other hand, if no error is detected, the EDAC 14 may provide information so indicating. In a distributed EDAC system, such as in our '009 application, the status information provided by the EDACs may be combined together and provided simultaneously to the tracking unit 24, or may be time multiplexed over the same conductors, or may be provided over separate conductors from each EDAC.
  • [0021]
    In general, the tracking unit 24 monitors the error information received from the EDAC 14, and maintains a historical record which facilitates early identification of trends. In our preferred embodiment, the tracking unit 24 is coupled to the memory controller 12, and can assert corresponding flags upon detecting certain predetermined memory conditions associated with memory errors. Such predetermined memory conditions generally fall into three categories: memory configuration, memory operation, and system configuration.
  • [0022]
    In accordance with our invention, it is irrelevant which particular unit is assigned the task of making control decisions in response to the historical error information maintained by the tracking unit 24. For example, either the memory controller 12, the tracking unit 24, or the system processor (not shown), may be appropriate in particular instantiations. Accordingly, for purposes of this description, we will assume that the tracking unit 24 is the designated decision maker, and is provided with an appropriately-programmed state machine (see, e.g., FIG. 2).
  • [0023]
    One convenient way to maintain the error information so that it stays “fresh” is to implement the history file as a “leaky bucket” wherein, periodically, the oldest individual record is automatically discarded. On the other hand, in many instantiations, a simple first-in, first-out circular queue of records will suffice. We prefer a combination of the two. In addition, we recommend providing any of several known mechanisms for selectively forwarding the error records to the external system (not shown) for longer-term storage. Performance monitoring system software (not shown) will then be able to provide a more in-depth analysis of the performance of the memory system 10.
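A minimal Python sketch of one way to combine the two history mechanisms described above follows. The class name `ErrorHistory`, its methods, and the record layout (address, kind) are hypothetical, and a hardware implementation would look quite different. A bounded queue provides the first-in, first-out behavior, and a separate `leak` call models the periodic discard of the oldest record.

```python
from collections import deque


class ErrorHistory:
    """Hypothetical error history: a bounded FIFO queue that also 'leaks'."""

    def __init__(self, capacity):
        # A deque with maxlen drops its oldest entry when a new one is
        # appended to a full queue (circular-queue behavior).
        self.records = deque(maxlen=capacity)

    def record(self, address, kind):
        """Store one error record, e.g. (0x1F40, "single-bit")."""
        self.records.append((address, kind))

    def leak(self):
        """Call periodically: discard and return the oldest record, if any."""
        if self.records:
            return self.records.popleft()
        return None

    def count(self):
        """Number of records currently held in the history."""
        return len(self.records)
```

A record returned by `leak` could be forwarded to an external system for longer-term storage instead of being dropped, in line with the forwarding mechanism recommended above.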
  • [0024]
    Now, with respect to memory configuration, the tracking unit 24 may focus on those locations in the DRAM 16 associated with repeating errors, to determine if the number of errors in a particular memory location during a relevant time duration has exceeded a selected threshold, indicating that this bit should be treated as a hard error. In this case, the tracking unit 24 may decide not to use the questionable location for data storage. In a suitably configured memory system 10, the tracking unit 24 may selectively enable a redundant storage element in the DRAM 16, such as a redundant column, row, or bit. Such redundant structures, and their methods of enablement and operation, are well known to practitioners in the field of DRAMs.
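The per-location threshold test described above might be sketched as follows. The function name `hard_error_candidates` and the idea of passing in a list of error addresses for the relevant time window are assumptions made for illustration.

```python
from collections import Counter


def hard_error_candidates(error_addresses, threshold):
    """Return addresses whose error count in the window exceeds the threshold.

    error_addresses: addresses of code words in error during the window
    (a hypothetical input format). Locations returned here are candidates
    for treatment as hard errors, e.g. for enabling a redundant element.
    """
    counts = Counter(error_addresses)
    return sorted(addr for addr, n in counts.items() if n > threshold)
```

For example, `hard_error_candidates([0x10, 0x20, 0x10, 0x10, 0x30, 0x20], 2)` returns `[0x10]`, since only that address erred more than twice in the window.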
  • [0025]
    With respect to memory operation, the tracking unit 24 may identify gradually degrading memory performance, such as might be due to operation in an environment of elevated temperature, poor voltage regulation, higher-than-normal alpha particle hit rates, and the like. Under such conditions, errors may not necessarily exhibit localized error patterns, but may simply occur more frequently. Our experience indicates that such errors often tend to be isolated to some “weak” portion of the DRAM 16, but they may also be distributed more or less randomly throughout a well-balanced DRAM 16. In such cases, the tracking unit 24 may instruct the refresh controller 20 and/or the scrub controller 22 to adjust their cycles accordingly.
  • [0026]
    For example, if errors appear to be increasing in frequency, it may be desirable to increase the frequency of scrub cycles to reduce the cumulative opportunity for catastrophic multiple soft errors. Similarly, it may be decided to increase the frequency of the refresh cycles. Or it may be desirable to use a “refresh with scrub cycle” such as we have described in our '008 application.
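One possible policy for adjusting the scrub period from the error trend is sketched below. The halve-or-double rule, the clamping limits, and the name `next_scrub_interval` are assumptions; the text only states that the scrub (or refresh) frequency may be increased when errors become more frequent.

```python
def next_scrub_interval(current, errors_now, errors_before, minimum, maximum):
    """Pick the next scrub interval (arbitrary time units) from the error trend.

    Assumed policy: halve the interval (scrub more often) when errors are
    rising, double it when they are falling, and otherwise leave it alone.
    The result is clamped to the range [minimum, maximum].
    """
    if errors_now > errors_before:
        proposed = current // 2
    elif errors_now < errors_before:
        proposed = current * 2
    else:
        proposed = current
    return max(minimum, min(maximum, proposed))
```

The same shape of rule could be applied to the refresh period, with limits chosen to suit the particular memory.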
  • [0027]
    In some commercially-available systems, the memory system 10 is implemented with a deregulated power supply. Such deregulation is often used intentionally to adhere to Federal Communications Commission requirements regarding noise signal transmissions. On the other hand, a deregulated power supply may adversely affect the DRAM 16. In such an instantiation, the error history maintained by the tracking unit 24 may indicate, for example, the need to refresh or scrub the DRAM 16 more often.
  • [0028]
    In accordance with our invention, the timing parameters and maintenance conditions of the DRAM 16 may also be adjusted to accommodate adverse operating conditions, such as temperature variations and changes in voltage bias. In each case, the relevant environmental condition may adversely affect the operation of the memory. One action taken to compensate for degraded performance of the DRAM 16 vis-à-vis error rate, for example, would be the insertion into each access cycle of one or more extra wait states. Should the operating conditions thereafter improve, the number of extra wait states can be reduced. In a memory system 10 which has been conservatively specified to insert one or more wait states, it may be possible, using our invention, to determine that operating conditions are, in fact, better than predicted at system design time, so that the number of such specified wait states can be reduced. Similarly, access cycle timing, refresh frequency, and scrub cycle scheduling can all be adjusted to compensate for the actual performance of the DRAM 16, whether such performance is better or worse than expected.
  • [0029]
    It is common practice for semiconductor manufacturers to offer a limited range of products, such as DRAMs, each having a respective guaranteed set of performance specifications related to stated operating conditions. However, it is also well known that often, due to normal process variation, a particular integrated circuit will, in fact, perform substantially better than specified, even under the exact operating conditions set forth in the respective specification, including temperature and voltage. In accordance with our invention, such latent performance can be recognized and exploited by appropriate adjustments to the several operating parameters until the error rate begins to suffer, and then backing off to return the error rate to the desired level.
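The "adjust until the error rate suffers, then back off" idea, applied to wait states, can be sketched as below. The half-of-target hysteresis band and the name `adjust_wait_states` are assumptions introduced for illustration; the text does not prescribe a particular rule.

```python
def adjust_wait_states(wait_states, error_rate, target_rate, max_wait_states):
    """Return the wait-state count to use for the next tracking interval.

    Assumed rule: add a wait state when the observed error rate is above
    the target, remove one when it is below half the target, and hold
    otherwise so that the setting does not oscillate.
    """
    if error_rate > target_rate:
        return min(wait_states + 1, max_wait_states)
    if error_rate < target_rate / 2 and wait_states > 0:
        return wait_states - 1
    return wait_states
```

Called once per tracking interval, this tightens the timing while the memory performs better than specified and relaxes it again when errors rise.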
  • [0030]
    With respect to system configuration, changes in the performance of the DRAM 16 over time, and compensative actions taken to address adverse changes, may necessitate changing certain operating parameters or characteristics of the system within which the memory system 10 is operating. In this case, memory performance information may provide warning signals to the processor of potential problems in sufficient time to allow corrective actions to be taken before serious degradation in performance occurs. In the event that the memory system 10 takes remedial action which would be visible to the system, such as inserted wait states, more frequent refresh/scrub cycles, or stretched access cycle timing, other system components must adapt accordingly. In some circumstances, the system may elect to enter a low-power mode or otherwise reduce its demands upon the memory system 10. In severe cases, the system may even decide to disable the entire memory system 10, or a selected portion thereof, until the underlying problem can be resolved. For example, the system may instruct the memory system 10 to put a portion of the DRAM 16 into a low-power or “sleep mode”, as described in our co-pending U.S. Patent Application Number [Attorney Docket No. JMS006-00], entitled “Method for Operating an Integrated Circuit having a Sleep Mode,” by Longwell, et al., filed on Apr. 30, 1999, and assigned to the assignee hereof (the “'006 application”). To facilitate our description of the present invention, we hereby expressly incorporate herein our '006 application by reference. The system processor (not shown) may also initiate further checks throughout the system in an effort to determine the root cause. In accordance with our invention, the historical error information maintained by the tracking unit 24 will greatly facilitate early identification of system problems.
  • [0031]
    In one embodiment, the tracking unit 24 may selectively activate a special circuit to assist in the analysis of the problem. For example, as shown in FIG. 1, a sensor 26 is frequently incorporated into an integrated circuit to facilitate testing of relevant operating parameters, such as temperature, bias voltage, etc. Typically, these devices are always “on”, continuously consuming operating power and generating excess heat. Sometimes, they are selectively enabled by system software (not shown) when an operating problem is suspected. In accordance with our invention, the tracking unit 24 can automatically enable the support sensor 26 so as to limit its operation to only those situations in which (and only so long as) the information it provides appears to be pertinent to selecting the most appropriate, alternate action to resolve a specific, detected error scenario. Of course, a suite of such sensors of the various known types may be provided to improve the ability of the tracking unit 24 to efficiently identify and isolate, if possible, the basic cause of the increased error rate so that the optimum response may be implemented in a timely fashion.
  • [0032]
    Shown in FIG. 2 is our preferred method of tracking errors in a memory system constructed in accordance with the preferred embodiment of our invention shown in FIG. 1. Although this method can be easily expanded to track errors detected in the course of normal read accesses, we have found the most natural tracking time interval for detecting error trends is a “scrub sequence”, that is, the time period required to scrub all locations in the DRAM 16 once. Applying this method to the memory system 10, a data structure (not shown) in the tracking unit 24 is initialized (28) prior to the DRAM 16 entering a normal operating mode (30). At a predetermined time thereafter, the memory system 10 performs a scrub operation (32). The data structure in the tracking unit 24 is then updated to reflect the status of the access (34). If no error was detected (36), the memory system 10 returns to normal operation (30). If an error was detected (36) but a predetermined criterion has not been satisfied (38), the memory system 10 also returns to normal operation (30). If the criterion has been met, the tracking unit 24 will assert an error flag (40) to advise all interested components, such as the memory controller 12, the DRAM 16 or the external processor (not shown), of the error event. If the error scenario is one which the tracking unit 24 is capable of correcting (42), the tracking unit 24 implements the appropriate fix (44), and the memory system 10 returns to normal operation.
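One pass through the flow just described can be sketched in Python as follows. The function name `scrub_step`, the list-based history, and the callback parameters are hypothetical. The numbers in the comments refer to the steps of FIG. 2 as described in the text.

```python
def scrub_step(history, error_detected, criterion_met, can_fix):
    """Run one pass of the tracking flow after a scrub operation (32).

    history: list of per-scrub results, updated in place.
    criterion_met, can_fix: caller-supplied predicates over the history
    (assumed interfaces, not taken from the text).
    Returns the actions taken before returning to normal operation (30).
    """
    history.append(error_detected)      # update the data structure (34)
    if not error_detected:              # error detected? (36)
        return []
    if not criterion_met(history):      # criterion satisfied? (38)
        return []
    actions = ["assert_error_flag"]     # advise interested components (40)
    if can_fix(history):                # correctable by the tracking unit? (42)
        actions.append("apply_fix")     # implement the fix (44)
    return actions
```

As a usage example, with `criterion_met=lambda h: sum(h) >= 2` the first detected error produces no action, and the second produces an error flag followed by a fix if `can_fix` allows it.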
  • [0033]
    In one embodiment, when the tracking unit 24 is updated (34), a high-order portion of an error counter maintained in the data structure in the tracking unit 24 is incremented each time an error is detected during a scrub cycle. In contrast, each time a scrub cycle detects no error in the scrubbed code word the tracking unit 24 decrements the entire error counter. When this error counter exceeds a predetermined threshold value, too many errors have occurred within a current “tracking window”. In general, the effect of each error has been “spread” over a wider window by incrementing the high-order portion of the error counter but decrementing the full error counter.
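The counter behavior described above can be modeled in a few lines of Python. Incrementing the high-order portion is modeled as adding 2**low_bits to the full counter value. The class name `SpreadErrorCounter`, the choice of `low_bits`, and the floor at zero are assumptions not stated in the text.

```python
class SpreadErrorCounter:
    """Hypothetical model of an error counter whose high-order part counts errors."""

    def __init__(self, low_bits, threshold):
        self.step = 1 << low_bits   # weight of one error in the full counter
        self.threshold = threshold
        self.value = 0

    def update(self, error_detected):
        """Apply one scrub result; return True if the threshold is exceeded."""
        if error_detected:
            self.value += self.step   # increment the high-order portion
        elif self.value > 0:
            self.value -= 1           # decrement the full counter (floor at 0 assumed)
        return self.value > self.threshold
```

With `low_bits=4`, each error takes 16 error-free scrub cycles to decay completely, which is how its effect is spread over a wider tracking window.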
  • [0034]
    Another relevant criterion may be the total errors since the last initialization event. Indeed, there may be several criteria, e.g., one for each of several error classes, such as single-bit errors, double-bit errors, etc. If any one of the criteria is met, the tracking unit 24 will assert the corresponding flag. Flagging the error may involve notifying the memory controller 12 that a decision criterion is satisfied, or may involve taking a specific action, such as initiating a sensing operation. In one embodiment, the tracking unit 24 provides specific locational information to the memory controller 12. Note that the specific action to be taken may be determined heuristically by software (e.g., an expert system) or hardware (e.g., fuzzy logic) in the tracking unit 24 and/or the memory controller 12.
  • [0035]
    The tracking unit 24 may also determine where to provide the information. For example, as illustrated in FIG. 1, the tracking unit 24 communicates directly with the refresh controller 20 and may decide to increase the frequency of refresh cycles. Similarly, based on the error history, the tracking unit 24 may decide to initiate more scrub cycles per unit time, or combine refresh and scrub cycles, or alter the refresh and/or scrub operation to improve the performance of the memory. Note that the tracking unit 24 may also identify good performance and adjust the system parameters to reduce the power consumption of the memory system or increase the time available for normal operation and thus increase the system performance of the memory.
  • [0036]
    An unusually rapid accumulation of errors may suggest the desirability of monitoring other system parameters, including junction temperature, leakage current, supply potential variations, and the like. Similarly, the error history may suggest the desirability of adjusting various reference voltages within the DRAM 16, such as those associated with capacitor plate voltage, boosted word line voltage, and substrate bias.
  • [0037]
    Where the error history provides feedback information for control of the refresh operation, the refresh period may be adjusted to reduce error rates. The feedback information may be limited to the duration between each refresh cycle, or may be cumulative. The advantage of such a feedback method is the direct correlation between deteriorating data integrity and the corrective action, as contrasted with the prior method of using temperature or voltage sensors to indirectly infer loss of integrity.
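    The difference between feedback limited to the interval between refresh cycles and cumulative feedback can be shown with a small model. The class `ErrorFeedback` and its method names are hypothetical and serve only to illustrate the distinction.

```python
class ErrorFeedback:
    """Hypothetical model of per-interval versus cumulative error feedback."""

    def __init__(self, cumulative):
        self.cumulative = cumulative
        self.count = 0

    def on_error(self):
        """Record one detected error."""
        self.count += 1

    def on_refresh(self):
        """Report the feedback value at a refresh cycle boundary."""
        value = self.count
        if not self.cumulative:
            # Per-interval mode: only errors since the previous refresh count.
            self.count = 0
        return value
```

    In per-interval mode the reported value starts over at each refresh cycle; in cumulative mode it keeps growing across cycles.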
  • [0038]
    In another embodiment, the access time for the memory device may be adjusted as a function of the error history. The ability to adjust this timing adds flexibility to DRAM design, allowing the memory to perform over a broader range of operating conditions and in a broad range of environments. Unlike conventional DRAM devices, for which operating environment and conditions are specified, the present invention allows the DRAM to, in effect, dynamically adapt to its surroundings.
  • [0039]
    According to the present invention, detected errors are tracked or accumulated and information is gleaned from the tracked errors. This information allows adjustment of memory parameters and system parameters. Similarly, measurements of parameters, such as voltage or temperature, whether initiated by the tracking unit 24, other means, or always operational, may be used in an open-loop fashion to adjust memory parameters or system parameters.
  • [0040]
    In another embodiment, multiple tracking units or error accumulators may be provided to monitor across-chip parameter variation, providing a finer granularity of information. Individual portions of memory or memory tiles may be handled distinctly. By intentionally biasing a DRAM bit circuit to fail first (i.e., before other bits fail), the memory system may anticipate more widescale problems prior to occurrence.
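    One way to picture multiple error accumulators is a counter per memory tile, keyed by address. This is a sketch under assumptions: the class `TileTrackers`, the mapping of addresses to tiles by integer division, and the default tile size are not taken from the source.

```python
from collections import Counter


class TileTrackers:
    """Hypothetical per-tile error accumulators."""

    def __init__(self, tile_size=0x1000):
        self.tile_size = tile_size        # addresses per tile (assumed)
        self.errors = Counter()           # tile index -> error count

    def record(self, address):
        """Attribute a detected error to the tile containing the address."""
        self.errors[address // self.tile_size] += 1

    def worst_tiles(self, threshold):
        """Return the sorted indices of tiles with at least `threshold` errors."""
        return sorted(t for t, n in self.errors.items() if n >= threshold)
```

    Tiles returned by `worst_tiles` could then be handled distinctly, for example by refreshing them more often than the rest of the array.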
  • [0041]
    While the present method has been described in the context of a scrub operation, the tracking may be done as an independent operation. Ideally, the tracking will not impact the operating speed of the DRAM device. Similarly, the error history may be evaluated each scrub cycle, or may be evaluated over an integral number of scrub cycles.
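    The abstract describes a counter that increments when an error is detected and decrements when none is detected. A minimal sketch of such a counter, evaluated only after an integral number of scrub cycles, might look like this. The class name, the floor at zero, the evaluation period, and the limit are assumptions for illustration.

```python
class ScrubHistory:
    """Hypothetical up/down error counter evaluated every N scrub cycles."""

    def __init__(self, evaluate_every=4, limit=3):
        self.evaluate_every = evaluate_every  # scrub cycles between evaluations
        self.limit = limit                    # counter value that meets the criterion
        self.counter = 0
        self.cycles = 0

    def on_word_checked(self, error_detected):
        """Count up on an error, down (not below zero) on a clean word."""
        if error_detected:
            self.counter += 1
        elif self.counter > 0:
            self.counter -= 1

    def on_scrub_cycle_done(self):
        """Return True or False at an evaluation point, None between them."""
        self.cycles += 1
        if self.cycles % self.evaluate_every:
            return None
        return self.counter >= self.limit
```

    Setting `evaluate_every=1` corresponds to evaluating the error history each scrub cycle.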
  • [0042]
    Thus it is apparent that there has been provided, in accordance with the present invention, a method for tracking errors in a memory system, and providing corrective action accordingly. Those skilled in the art will recognize that modifications and variations can be made without departing from the spirit of the invention. Therefore, it is intended that this invention encompass all such variations and modifications as fall within the scope of the appended claims.
Classifications
U.S. Classification: 714/42, 714/E11.025
International Classification: G06F11/07, G11C29/02, G06F11/00
Cooperative Classification: G11C29/56012, G06F2201/88, G06F11/073, G06F11/0772, G11C11/401, G11C29/02, G06F11/076, G11C29/028, G11C29/50016
European Classification: G06F11/07P1G, G06F11/07P4B, G11C29/50D, G11C29/56C, G11C29/02H, G11C29/02