US H1741 H
A method and apparatus for stress testing a computer memory system is disclosed. A sequential series of bit patterns, comprising 1's and 0's, is impressedd upon a computer memory so that, during testing, every memory cell stores a 1 while the eight adjacent neighboring memory cell stores 0's. Subsequently, the complimentary bit patterns are impressed upon memory, wherein every memory cell, at some time during the test, stores a 0 while the eight immediately adjacent neighboring memory cells store 1's. The disclosed bit pattern maximizes stress on the cells, In an interleaved dual memory bank configuration, memory cells are sequentially accessed from the highest to the lowest address of one memory bank, while memory cells are sequentially accessed from the lowest to the highest memory address in the other memory bank. Toggling successive memory accesses between the dual interleaved memory banks maximizes stress on the address driver components of the computer memory system. Preferably, the dual memory banks are simultaneously tested with complementary bit patterns to also maximize stress on the data bus driver and associated components and to maximize demand on the power supply system.
1. A method for stress testing a memory system having a plurality of memory cells capable of being grouped in a plurality of neighborhoods of nine cells, each neighborhood of nine cells including a center cell and eight cells surrounding the center cell, the method comprising the steps of:
(a) writing all of the memory cells within a predetermined address range of the memory system with a first value;
(b) writing a plurality of target cells with a second value which is a complement of the first value, each of the target cells being a center cell of each of a first group of neighborhoods of nine cells within the address range, each of the target cells being surrounded by eight non-target cells;
(c) reading all of the non-target cells within the address range and reporting a failure condition if any non-target cells contain other than the first value; and
(d) reading all of the target cells and reporting a failure condition if any target cells contain other than the second value.
2. The method of claim 1, further comprising the step of repeating the steps (a) to (d) until all of the cells within the address range have been written as target cells of a plurality groups of neighborhoods of nine cells.
3. The method of claim 2, further comprising the step of repeating the steps (a) to (d) with the second value written to all of the plurality of memory cells when the step (a) is performed and the first value written to the plurality of target cells when the step (b) is performed.
4. A system for stress testing a computer memory system having a plurality of memory cells arranged in a matrix pattern, comprising:
means for writing to a target cell within a neighborhood of 9 cells;
first means for reading the target cells 8 nearest neighbors, detecting a disturbance in the target cells 8 nearest neighbors and determining if the write to the target cells caused the disturbance;
second means for reading the target cell and determining whether a disturbance occurred in the target cell while the first means for reading read the 8 nearest neighbor cells.
5. A system for stress testing a memory system having a plurality of memory cells capable of being grouped in a plurality of neighborhoods of nine cells, each neighborhood of nine cells including a center cell and eight cells surrounding the center cell, the system comprising:
first writing means for writing all of the memory cells within a predetermined address range of the memory system with a first value;
second writing means for writing a plurality of target cells with a second value which is a complement of the first value, each of the target cells being a center cell of each of a first group of neighborhoods of nine cells within the address range, each of the target cells being surrounded by eight non-target cells;
first reading means for reading all of the non-target cells within the address range and reporting a failure condition if any non-target cells contain other than the first value; and
second reading means for reading all of the target cells and reporting a failure condition if any target cells contain other than the second value.
6. The system of claim 5, further comprising means for causing the first and second writing means and the first and second reading means to repeat performing the writing and reading functions, respectively until all of the cells within the address range have been written as target cells of a plurality groups of neighborhoods of nine cells.
7. The system of claim 6, further comprising means for causing the first and second writing means and the first and second reading means to repeat performing the writing and reading functions, respectively with the first writing means writing the second value to all of the plurality of memory cells and the second writing means writing the first value to the plurality of target cells.
8. A computer program product comprising:
a computer usuable medium having computer readable code embodied therein for stress testing a memory system having a plurality of memory cells capable of being grouped in a plurality of neighborhoods of nine cells, each neighborhood of nine cells including a center cell and eight cells surrounding the center cell, the computer program product comprising:
computer readable program code devices configured to cause a computer effect performing a first writing function by writing all of the memory cells within a predetermined address range of the memory system with a first value;
computer readable program code devices configured to cause a computer effect performing a second writing function by writing a plurality of target cells with a second value which is a complement of the first value, each of a first group of target cells being a center cell of each of a first group of neighborhoods of nine cells within the address range, each of the target cells being surrounded by eight non-target cells;
computer readable program code devices configured to cause a computer effect performing a first reading function by reading all of the non-target cells within the address range and reporting a failure condition if any non-target cells contain other than the first value; and
computer readable program code devices configured to cause a computer effect performing a second reading function by reading all of the target cells and reporting a failure condition if any target cells contain other than the second value.
9. The computer program product of claim 8, further comprising computer readable program code devices configured to cause a computer to effect repeating performing the first and second writing functions and the first and second reading functions until all of the cells within the address range have been written as target cells of a plurality groups of neighborhoods of nine cells.
10. The computer program product of claim 9, further comprising the computer readable program code devices configured to cause a computer to effect repeating performing the first and second writing functions and the first and second reading functions, with the second value written to all of the plurality of memory cells and the first value written to the plurality of target cells.
This application includes appendices entitledd: Appendix A--"Source Code for Pattern Sensitivity Testing" including 12 pages, and Appendix B--Pseudo-Code for Pattern Sensitivity Test Process, comprising 1 page.
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyrights whatsoever.
1. Field of the Invention
The present invention relates to the field of testing the integrity and reliability of computer memory systems. More particularly, the present invention relates to stress testing modern high speed interleaved computer memory systems for active neighborhood pattern sensitivity faults ("ANPSF") and passive neighborhood pattern sensitivity faults ("PNPSF").
2. Description of Related Art
Solid state random access memory ("RAM") is one of the most important components of modern high speed computers. As operating systems and application programs become increasingly complex, they demand more and faster RAM. Therefore, much engineering effort has been directed toward increasing the storage capacity of RAM (i.e., increasing the number of storage cells in a RAM chip) while simultaneously decreasing RAM access times and physical size.
Unfortunately, however, as the number of memory cells within a given memory chip increases, so does the probability that the chip may contain a non-functioning cell. If the operating system, for example, encounters a malfunctioning memory cell, computer operation may be halted and critical data may be lost. Failures in the components of memory address drivers may have similar negative consequences. Therefore, there exists a need for efficient and effective methods to locate problems with RAM memory and associated address driver components during the computer manufacturing process, so as to eliminate, or at least minimize, memory malfunctions during normal customer operation.
In an effort to locate non-functioning components of address drivers and memory storage cells, and components and cells which are prone to fail during normal computer operation, test engineers have developed tests which subject such components to stresses in excess of that which would be expected during normal operation. For example, memory test algorithms which access memories with converging addresses stress the components that address the storage cells of the memory chips. A typical test of this type makes its first access to the lowest memory address within the test range, the second memory access to the highest memory address within the test range, the third access to the next to lowest address, the fourth access to the next highest and so forth. Eventually, the addresses being accessed converge at the middle of the physical memory. This method of stepping the chip addressing back and forth while the memory is under test is often referred to as "butterfly addressing." Butterfly addressing algorithms in the past were typically applied to March test algorithms. March test algorithms perform read/write/read operations on each memory location as they progress through memory. Alternating the addresses in this fashion maximizes the number of 0 to 1 and 1 to 0 transitions on the memory address lines, and therefore maximizes associated power consumption and concomitant stress on the components that access the memory cells.
As the physical size of RAM memory decreases, charge leakage between adjacent memory cells becomes the primary failure mode. Therefore, creating stress on the inter-cell physical paths between near neighbor cells is a important goal of test engineers. However, because of the extremely large number of different combinations of static and dynamic states that can exist among and between the eight cells immediately adjacent a target cell under test, such tests have been impossible to completely realize without using algorithms for addressing the memory that degrade raw performance of a memory test to the point that the time required to execute the tests cannot be justified for the benefits gained. The article by Magdy S. Abadir and Hassan K. Reghbati, Computing Surveys, Vol 15, No. 3, September 1983, describes some of the requirements for such memory pattern sensitivity testing.
IEEE Transactions on Computers, Vol. C-26, No. 11 (November 1977), pp. 1141-1144 by Knaizuk and Hartmann, discloses a test algorithm that exploits pattern sensitivity testing in a "neighborhood of five." The term "neighborhood of five" refers to the total number of memory cells involved in a test adjacent to and including a particular target cell. In a grid of memory cells, a neighborhood of five includes the target cell, the cells immediately above and below the target cell and the cells immediately to the sides of the target cell. Neighborhood of five testing, however, fails to test for charge leakage between the target cell and diagonally adjacent cells.
With today's memory chips achieving ever increasing higher density, the possible leakage paths between meory cells becomes shorter and shorter. As the memory cell density increases, the probability of the most common memory fault becoming a leakage path to a neighboring memory cell increases. Thus, testing of the "neighborhood of nine" memory cells for leakage between all adjacent cells, including diagonally adjacent cells, becomes very important.
This invention provides methods and apparatus for pattern sensitivity and address stress testing og memory systems. The invention provides a test procedure that accomplishes a complex but efficient "neighborhood of nine" pattern sensitivity test. The term "neighborhood of nine" refers to the total number of nearest neighbor memory cells tested immediately adjacent to and including a target cell. Simultaneously, the invention stresses the system with an address stress test which produces the stress of "butterfly addressing", but in a distinctly different manner. For the purpose of this disclosure (and to avoid confusion with prior art techniques) the inventive address stress testing technique described hereinafter will be referred to as "pseudo-butterfly addressing". "Butterfly addressing" refers generally to a procedure wherein successive memory accesses are made to widely disparate physical locations within a computer memory system.
If desired, the invention may be used to separately or simultaneously:
(1) Provide address driver and system power stress testing;
(2) Accomplish complete neighborhood of nine pattern sensitivity testing to assure that there are no passive neighborhood pattern sensitivity faults;
(3) Detect active neighborhood pattern sensitivity faults without jeopardizing the viability of the test algorithm from excessive complexity or long run time; and
(4) Provide a test algorithm uniquely suited to a multi-bank interleaved computer memory system.
This invention may be used to test for both Passive Neighborhood Pattern Sensitivity Faults (PNPSF) and Active Neighborhood Pattern Sensitivity Faults (ANPSF). ANPSF refers to faults which result from a change in the state of a memory cell. PNPSF refers to memory cell malfunctions which occur because of a particular static memory pattern impressed upon a memory chip.
The inventive neighborhood of nine pattern sensitivity test stresses a given target cell with eight neighbor memory cells that are in the opposite state, so as to provide the maximum charge differential between the target cell and the eight nearest neighbor adjacent cells, and thereby the greatest opportunity for charge leakage.
For a method viewpoint, the invention involves the following steps:
(1) Write to the target cell to see if a write operation to that cell is possible, and that it does not create a disturbance (i.e., a change in state) in any of the eight nearest neighbor cells:
(2) Read the target cell's eight nearest neighbors to detect any disturbance caused by the write operation to the target cell and also to test the possibility that reading a neighbor cell might disturb the target cell; and
(3) Read the target cell to verify that no disturbance occurred while reading the target cells eight nearest neighbors.
From a first alternative method viewpoint, the invention involves the following steps:
(1) Write a specific pattern to all eight neighbor cells in a neighborhood of nine cells;
(2) Write all target cells to a compliment pattern (where 0 is the complement of 1 and 1 is the complement of 0);
(3) Read all neighborhood cells to see if the target cell writes created any disturbance in the neighborhood;
(4) Read all target cells to see if reading the neighbor cells disturbed any target cells; and
(5) Repeat each of steps 1-4 until all possible cells of a neighborhood of nine cells become a target cell, thereby testing for the ability of each cell to hold either a 0 or a 1 while the maximum charge disturbance is created in the neighborhood of each such target cell.
(6) Repeat steps (1) through (5) with the complimentary pattern of 0s and 1s.
The invention requires 12 iterations of steps 1 through 4 above to completely test all static pattern combinations.
From a third alternative method viewpoint, when the invention is used in a memory system having multiple (for example, two) banks of interleaved physical memory units, the test sequence in the pattern sensitivity stress test for interleaved memory includes the following steps:
(1) Write all memory including both target (foreground) cells and neighbor (background) cells to the same state to validate memory;
(2) Write the target cell of each modulo 6 frame to attempt to disturb its nearest eight neighbors. The writes are accomplished during a scan of one memory bank from the lowest target memory cell address ascending to the highest target memory cell address. The other memory bank being similarly scanned from the highest to lowest target cell address, while alternating memory accesses between memory banks. (Note that this step only requires writing one sixth of the memory under test);
(3) Read the eight neighbor (background) cells immediately adjacent to each target (foreground) cell, scanning one memory bank from the lowest memory address ascending to the highest memory address, alternating between accesses to each memory bank, the other memory bank being scanned from the highest memory address descending to the lowest memory address. Toggling accesses between the two memory banks using this pseudo-butterfly addressing technique maximizes stress on the address drivers and the system power source. Unlike the prior technique, however, the addresses do not converge in the middle of a single physical memory chip. This read of the entire memory is done to detect any cell disturbance in either the target (foreground) or neighbor (background) cells. During this time the invention may, if desired, utilize error correction code (ECC) to detect single bit correctable errors in a manner known in the art. In all cases target (foreground) and neighbor (background) cells should preferably contain data that is a complement of the other;
(4) Restore the target (foreground) cells to the same pattern as the neighbor (background) cells. Again, this only requires writing one sixth of the memory under test. If the sixth cell of the modulo 6 memory frame is currently the target, the entire memory is written to the complement of the neighbor (background) state;
(5) Sequence to the next sequential cell within the modulo 6 memory frame or start over with the complementary test pattern if all six cells of the modulo 6 memory frame have been tested; and
(6) After sequencing to the next cell target, perform the preceding steps 2 through 5.
All six cells of the modulo 6 memory frame are tested in the above manner twice, once to test with target cells as 1's, and once to test target cells as 0's. Therefore, twelve iterations of the above process are needed to target each cell both in a neighborhood of 1's and in a neighborhood of 0's.
The invention will now be described with reference to the accompanying drawings, wherein:
FIG. 1 illustrates a computer memory system, including a memory controller having an address driver, data bus driver and two interleaved solid state RAM memory chips.
FIGS. 2-7 sequentially illustrate the bit pattern progression in a small 16×16 memory cell array using the neighborhood of nine test sequence, wherein the 1 bits are the "foreground" or target bits and the 0 bits form the "background" or non-target bits.
FIGS. 8-13 illustrate the complimentary bit pattern progression in the same small 16×16 memory cell, wherein the 0 bits comprise the "foreground" or target bits and the 1 bits comprise the "background" or non-target bits.
The following description is of the best presently contemplated modes of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and is not to be taken in a limiting sense.
FIG. 1 illustrates the architecture of a computer memory system 10 utilizing interleaved dual memory banks 12, 14. In accordance with a method known to those trained in the computer memory field, a memory controller 15 including an address driver 16 transmits signals along address lines 18 in response to instructions from CPU 20. As memory cells 22 are addressed, in accordance with signals from address driver 16, each memory cell 22 is read from and written to via signals transmitted along the data bus 21 from the data bus driver 24 also in a manner known to those trained in the computer memory field.
Memory accesses require a certain fixed set-up time and, therefore, many high speed computer memory systems now use "interleaved memory," wherein sequential memory accesses occur first to one memory bank 12 then to the other memory bank 14. In this way, memory access times are effectively decreased since a memory read or write operation can take place in one memory bank 12 while the other memory bank 14 is being set up for the next succeeding memory access.
FIGS. 2-13 illustrate a preferred embodiment of the present invention operative in a small 16×16 memory celll array. The 16×16 array is used for illustrative purposes; practical computer memories, of course, tend to be significantly larger.
In the 16×16 cell array illustration, FIGS. 2-13 show the twelve sequential patterns written into the array. One can readily see how the eight neighbors of a given target cell are in the opposite state from the target cells. The cells containing 1's are totally surrounded by 0's, even in locations that are on diagonals from the target cells. That is, each of eight neighbors in the neighborhood of nine cells are of the opposite state from the target cells. With the twelve patterns of the test, each cell in the array in turn becomes surrounded with cells of the opposite state, for both the 0's and 1's cases. The illustrations of FIGS. 2-13 shows the contents of a bit slice one bit wide in one bank of the memory under test. A 16×16 memory was chosen for illustration ease, although the same principle applies to memory arrays of any size that are addressed modulo two.
The algorithm specifically described herein is implemented for a two bank interleaved memory system, as shown in FIG. 1, but is readily adaptable to non-interleaved memory systems or memory systems with interleave greater than two.
Passive Pattern Testing:
As illustrated b the vertical dashed lines in FIG. 2, a neighborhood of nine test scenario can be developed in a memory system that is addressed modulo 2 by subdividing memory horizontally into groups of cells modulo 6. A modulo 6 cell group may be referred to in this description of the test algorithm as a "frame". The specific cell under test is referred to as the target cell.
With reference to FIG. 2, the initial test pattern puts a 1 into the first target cell of each frame and 0's into each of the other five cells of the frame. Imaginary squares 32 illustratively surround three neighborhoods of nine adjacent memory cells. To cover the possibilities of each cell within a frame being a 1 with the other eight cells being 0's requires storing six different pattern sequences into memory. FIG. 3-7 illustrate each of these subsequent sequences. Six additional sequences, illustrated in FIGS. 8-13, are required to make each cell within a frame be a 0 with 1's in the other five cells of the frame. A total of twelve test sequences, with the unique patterns illustrated in FIGS. 2-13 collectively, are required to make each of the cells of the "neighborhood of nine" become a target cell holding a 0 and a 1.
The process for impressing the various illustrated memory patterns on to the memory banks 12,14 is driven by a software program which runs on a CPU 20 (FIG. 1). The instructions comprising this program are preferably stored in cache memory 34 separate from the memory 12,14 under test. The CPU 20 provides instructions to the address driver 16 and to a data bus driver 24 to cause memory I/O to be performed on each of the memory cells 22 under control of the software program.
A presently preferred program for accomplishing the memory test process disclosed herein is contained in Appendix A attached hereto and incorporated herein by reference. This program is written in C computer programming language and will be readily understood by one of ordinary skill in the relevant technological field. To further elucidate a presently preferred embodiment of the subject invention, Appendix B, which is also attached hereto and incorporated herein by reference, contains pseudo code for the stress testing process.
As implemented by the software program of Appendix A, the memory test procedure includes the major steps of:
(1) Writing all 0's into both memory banks:
(2) Write a 1 into each group of modulo 6 memory cells in one of the memory banks according to the bit pattern illustrated in FIGS. 2-7 and in a first direction of memory addresses (e.g., increasing memory addresses) and write the same bit pattern into the other memory bank, except in the reverse direction of memory addresses;
(3) Truncate the bit pattern at the end of memory if the end of memory does not coincide exactly with the end of a pattern sequence;
(4) Read all of memory and compare the originally written pattern with the pattern as now read;
(5) If any miscompares are located, the software logs an error so that appropriate action may be taken. Depending upon the circumstances, and severity of the problem, the test technician may replace the memory chip, address driver or other failed component, or the CPU may be instructed to avoid the page of memory containing the failed memory cell.
(6) Write one-sixth of memory by overwriting all 1's with 0's.
(7) Re-write the original pattern of 1's and 0's into memory, but in this step the modulo 6 bit pattern is shifted by one memory cell. (Again ony one-sixth of memory is written, thereby making this process exceptionally fast).
(8) Read the entire memory looking for miscompares between the pattern written in step 7 above and the pattern read in this step 8 and report any problems for appropriate action.
(9) Repeat steps 2-8 until all memory cells are written with a 1.
(10) Repeat steps 1 through 9 with the complementary bit pattern.
(11) Alternate all memory accesses between the two interleaved memory banks according to the previously described pseudo-butterfly addressing technique such that, at any one time, the two memory banks are being tested with complementary patterns--i.e., one memory bank contains a bit pattern composed predominantly of 1's and the other memory bank contains a bit pattern composed primarily of 0's.
Pseudo-Butterfly Stress Testing:
An important aspect of this invention in some preferred embodiments is that the process may be run on a multi-bank (interleaved) memory system. Interleaved memory systems are necessary to achieve the high memory bus data rates required by today's high performance computers.
An aspect of certain specific preferred embodiments of the present invention includes the use of a type of butterfly addressing technique. This memory test process simultaneously undertakes to perform a pattern sensitivity test that stresses paths to the nearest eight neighbors of a given memory cell (neighborhood of nine), as set forth previously, while simultaneously gaining the benefits of additional stress testing by alternately addressing multiple memory banks--one memory bank sequentially from the highest address to the lowest address while alternately accessing the other memory bank from the lowest address to the highest address. This aspect of the invention thus succeeds in combining the stress induced by previously known "butterfly addressing" sequences (but in a very different way) with a complex "neighborhood of nine" pattern sensitivity test.
The process implemented by the software program of Appendix A is optimized for a two bank memory system such as that shown in FIG. 1., wherein one of the memory banks, e.g. 12, is tested (i.e., addressed during the test) sequentially from the lowest address toward the highest address. The other of the two memory banks, e.g., 14, is tested from the highest address toward the lowest address.
It will be understood that, since the presently preferred embodiment duscussed herein is implemented in an interleaved dual memory bank memory system, the "pseudo-butterfly addressing" sequence discussed herein differs significantly from the butterfly addressing techniques discussed previously. In particular, wherein in prior techniques subsequent memory accesses alternated between low and high memory addresses and ultimately converged at a memory address in the middle of a single memory chip, the present addressing technique sweeps continuously from a high memory address to a low memory address of a single memory chip. However, from the point of view of an address driver, sequential memory accesses occur first to one memory bank and subsequently to another and different alternate interleaved memory bank. Increased stress on the memory driver components, however, is achieved notwithstanding a continuously increasing or decreasing memory address access to a single memory bank because sequential memory accesses occur between different memory banks and the different memory banks are alternately accessed from a high to low memory address and from a low to high memory address, respectively. Thus, address driver output approaches the maximum stress condition of a simultaneously switched outputs ("SSO").
In some systems pseudo-butterfly addressing will induce a considerable increase in overall noise in the power supply because of the increased switching which is induced in the output of the address driver 16 and data bus driver 21. In a worst case scenario pseudo-butterfly addressing may even cause noticeable "ground shift" in the power system. In any event, the results of the pattern sensitivity stress testing disclosed herein may be useful to test engineers in improving the design of the computer memory and power systems, as well as in locating single faults in memory chips.
Several preferred embodiments of the present invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. For example, the invention may be utilized in interleaved computer memory systems having more than two interleaved memory banks. Alternatively, the method and apparatus of the present invention may be adapted for use in a computer system having a single memory bank. Furthermore, patterns other than those specifically disclosed as preferred in the application may be used to create stress in a computer memory subsystem so as to assist in the identification of existing faults and the identification of components likely to fail during normal use of the computer. Memory protected by an eror correcting code (ECC), although helpful, is not an essential component of the inventions described herein. The pseudo-butterfly memory accessing technique described herein may be used during some or all memory access passes through memory in an interleaved computer memory system. Thus, the present invention is not limited to the preferred embodiments described herein, but may be altered in a variety of ways which will be apparent to persons skilled in the art.
Appendix A--"C Language Source Code Listing For Pattern Sensitivity Testing"
Appendix B--"Pseudo-code For Pattern Sensitivity Testing Process" ##SPC1##