US 20050075832 A1
System and method of continuously analyzing trial data of an ongoing clinical trial is provided. A statistical analysis is performed on a trial database containing subject trial data. If the result of the statistical analysis does not exceed a predetermined threshold value, then the statistical analysis is repeated while the clinical trial is ongoing. In a blinded clinical trial, a grouped database is generated from the trial database and a blinding database prior to performing the statistical analysis. The grouped database groups the subject trial data according to the study groups. The ability to continuously monitor and analyze the trial data for statistical significance in tandem with data collection while the trial is ongoing provides many benefits to the researchers because the trial database no longer becomes the bottleneck in obtaining useful results and statistical analysis can be conducted on a near real-time basis without having to wait until completion of the trial.
1. A method of continuously analyzing trial data of an ongoing clinical trial, the method comprising:
accessing a trial database containing trial data of subjects in a clinical trial;
performing a statistical analysis on the accessed trial database;
determining whether the result of the statistical analysis exceeds a predetermined threshold value; and
if it is determined that the result of the statistical analysis does not exceed the predetermined threshold value, then repeating the steps of accessing, performing and determining while the clinical trial is ongoing.
2. The method according to
reading a user defined criteria that defines the level of cleanliness of the trial data for statistical analysis; and
retrieving only those trial data that meet the user defined criteria from the trial database.
3. The method according to
4. The method according to
accessing a blinding database containing subject identifiers and associated study group identifiers, each study group identifier identifying to which study group an associated subject belongs; and
producing a grouped database from the clinical database and the blinding database for statistical analysis, the grouped database grouping the study data according to the study group.
5. The method according to
6. The method according to
7. The method according to
reading a predefined criteria that defines the level of cleanliness of trial data required for analysis;
retrieving only those trial data that meet the predefined criteria from the trial database;
accessing a blinding database containing subject identifiers and an associated study group identifier for each subject, each study group identifier identifying to which study group each subject belongs; and
producing a grouped database from the retrieved trial data and the blinding database for statistical analysis, the grouped database grouping the trial data according to the study group.
8. The method according to
9. The method according to
10. The method according to
11. The method according to
retrieving a user defined statistical model; and
running the retrieved user defined statistical model on the trial database.
12. A method of continuously analyzing trial data of an ongoing blinded clinical trial, the method comprising:
accessing a trial database containing blinded trial data of subjects in an ongoing blinded clinical trial;
accessing a blinding database containing subject identifiers and associated study group identifiers, each study group identifier identifying to which study group an associated subject belongs;
producing a grouped database from the trial database and the blinding database, the grouped database grouping the trial data according to the study group;
performing a statistical analysis on the produced grouped database;
determining whether the result of the statistical analysis exceeds a predetermined threshold value; and
if it is determined that the result of the statistical analysis does not exceed the predetermined threshold value, then repeating the above steps of:
accessing a trial database,
producing a grouped database,
performing a statistical analysis, and determining
while the clinical trial is ongoing.
13. The method according to
reading a user defined criteria that defines the level of cleanliness of trial data for statistical analysis; and
retrieving only those trial data that meet the user defined criteria from the trial database for statistical analysis.
14. The method according to
15. The method according to
16. The method according to
17. The method according to
18. A system for continuously analyzing an ongoing clinical trial comprising:
a storage device operable to store a trial database containing trial data of subjects in an ongoing clinical trial;
a processor coupled to the storage device; and
an analysis program executable by the processor and operable to:
perform a statistical analysis on the trial database;
determine whether the output result of the statistical analysis exceeds a predetermined threshold value; and
repeat the statistical analysis while the clinical trial is ongoing if it is determined that the result of the statistical analysis does not exceed the predetermined threshold value.
19. The system according to
read a user defined criteria that defines the level of cleanliness of trial data for statistical analysis; and
retrieve only those trial data that meet the user defined criteria from the trial database.
20. The system according to
21. The system according to
access a blinding database containing subject identifiers and associated study group identifiers, each study group identifier identifying to which study group an associated subject belongs; and
produce a grouped database from the trial database and the blinding database for statistical analysis, the grouped database grouping the trial data according to the study group.
22. The system according to
23. The system according to
24. The system according to
This application relates to data processing of clinical trial data and more specifically to a system and method for statistically analyzing the clinical trial data.
In the United States, the Food and Drug Administration (FDA) oversees the protection of consumers exposed to health-related products ranging from food, cosmetics and drugs to gene therapies and medical devices. Under FDA guidance, clinical trials are performed to test the safety and efficacy of new drugs, medical devices or other treatments to ultimately ascertain whether or not a new medical therapy is appropriate for widespread human consumption.
More specifically, once a new drug or medical device has undergone studies in animals, and results appear favorable, it can be studied in humans. Before human testing is begun, findings of animal studies are reported to the FDA to obtain approval to do so. This report to the FDA is called an application for an Investigational New Drug (IND).
The process of experimentation is referred to as a clinical trial, which involves four phases. In Phase I, a few research participants (approximately 5 to 10), referred to as subjects, are used to determine the toxicity of a new treatment. In Phase II, more subjects (10-20) are used to determine efficacy and further ascertain safety. Doses are stratified to try to gain information about the optimal dose. A treatment may be compared to either a placebo or another existing therapy. In Phase III, efficacy is determined. For this phase, more subjects, on the order of hundreds to thousands of patients, are needed to perform a meaningful statistical analysis. A treatment may be compared to either a placebo or another existing therapy. In Phase IV (post-approval study), the treatment has already been approved by the FDA, but more testing is performed to evaluate long-term effects and to evaluate other indications.
During clinical trials, patients are seen at medical clinics and asked to participate in a clinical research project by their doctor, known as an investigator. After the patients sign an informed consent form, they are considered enrolled in the study, and are subsequently referred to as study subjects. A study sponsor, generally considered to be the company developing a new medical treatment and supporting the research, develops a study protocol. The study protocol is a document describing the reason for the experiment, the rationale for the number of subjects required, the methods used to study the subjects, and any other guidelines or rules for how the study is to be conducted. Prior to usage, the study protocol is reviewed and approved by an Institutional Review Board (IRB). An IRB serves as a peer review group, which evaluates a protocol to determine its scientific soundness and ethics for the protection of the subjects and investigator.
Creation of Study Groups (Study Arms)
Subjects enrolled in a clinical study are stratified into groups that allow data to be assessed in a comparative fashion. In a common example, one study arm, known as a control group (or “control”), will use a placebo, whereby a pill containing no active chemical ingredient is administered. In doing so, comparisons can be made between subjects receiving actual medication versus placebo.
Subjects enrolled into a clinical study are assigned to a study arm in a random fashion, which is done to avoid biases that may occur in the selection of subjects for a trial. For example, a subject who is a particularly good candidate to respond to a new medication might be intentionally entered into the study arm to receive real medication and not a placebo. This could skew the data and outcome of the clinical trial to favor the medication under study, by the selection of subjects who are most likely to perform well with the medication. In instances where only one study group is present, randomization is not performed.
Blinding is a process by which the study arm assignment for subjects in a clinical trial is not revealed to the subject (single blind) or to both the subject and the investigator (double blind). This minimizes the risk of data bias. Virtually all randomized trials are blinded by definition. In instances where only one study group is present, blinding is not performed.
Statistical Analysis of Trial Data
Generally, at the end of the trial, the database containing the completed trial data is shipped to a statistician for analysis. If particular occurrences, such as adverse events, are seen with an incidence that is greater in one group than in another to a degree that exceeds what pure chance alone would likely produce, then it can be stated that statistical significance has been reached. Using statistical calculations, the comparative incidence of any given occurrence between groups can be described by a numeric value, referred to as a “p-value”. The p-value is the probability of observing a difference at least as large as the one seen if chance alone were at work. A p-value near 1.0 indicates that the observed difference is fully consistent with chance alone. Conversely, a p-value near 0.0 indicates that such a difference would be very unlikely to arise by chance alone. Generally, values of p<0.05 are considered to be “statistically significant”, and values of p<0.01 are considered “highly statistically significant”.
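By way of illustration only, the comparison of adverse event incidence between two study groups can be sketched as follows. The sketch uses a two-sided Fisher exact test, which is one of several tests that could be applied and is not named in this specification; the function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(events_a, n_a, events_b, n_b):
    """Two-sided Fisher exact p-value for event counts in two groups.

    events_a / n_a: subjects with the event / total subjects in group A.
    events_b / n_b: the same for group B.
    (Hypothetical helper; the specification does not prescribe this test.)
    """
    total_events = events_a + events_b
    denom = comb(n_a + n_b, total_events)

    def prob(x):
        # Hypergeometric probability of x events falling in group A.
        return comb(n_a, x) * comb(n_b, total_events - x) / denom

    observed = prob(events_a)
    lowest = max(0, total_events - n_b)
    highest = min(n_a, total_events)
    # Sum the probability of every table at least as unlikely as the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts: 2 of 50 subjects with the event in one group, 12 of 50 in the other.
p_value = fisher_exact_two_sided(2, 50, 12, 50)
print("statistically significant" if p_value < 0.05 else "not significant")
```

An exact test is used here only because it needs nothing beyond the standard library and remains valid for small event counts.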
In some clinical trials, multiple study arms, or even a control group, may not be utilized. In such cases, only a single study group exists with all subjects receiving the same treatment. This is typically performed when historical data about the medical treatment, or a competing treatment is already known from prior clinical trials, and may be utilized for the purpose of making comparisons.
The creation of study arms, randomization, and blinding are techniques that are used in most clinical trials where scientific rigor is of high importance. However, these methods lead to several challenges, since they prevent the clinical trial sponsor from tracking key information related to safety and efficacy.
Regarding safety, the objective of any clinical trial is to document the safety of a new treatment. However, in clinical trials where randomization is conducted between two or more study arms, this can be determined only as a result of analyzing and comparing the safety parameters of one study group to another. Unfortunately, because the study arm assignments are blinded, there is no way to separate out subjects and their data into corresponding groups for purposes of performing comparisons while the trial is being conducted. Since many clinical trials may last for time periods extending for years, it is conceivable to have a treatment toxicity go unnoticed for prolonged periods without intervention.
Regarding efficacy, any clinical trial seeking to document efficacy will incorporate key variables that are followed during the course of the trial to draw the desired conclusion. In addition, studies will define certain outcomes, or endpoints, at which point a study subject is considered to have completed the protocol. These parameters, including both key variables and study endpoints, cannot be analyzed by comparison between study arms while the subjects are randomized and blinded. This poses potential problems in ethics and statistical analysis.
When new medications or other health-related treatments are of superior efficacy to anything else, it is ethical to allow usage of the treatment for those in imminent need, even prior to final government approval. Conversely, when available, it is considered unethical to withhold such treatments. For example, if a medication were to be identified that eradicated the Human Immunodeficiency Virus (HIV), it would be unethical to allow diseased patients to continue suffering and even die of the illness, while the medication was being clinically tested for purposes of government approval. Ideally, in such situations, identification of effective treatments should occur early in the project. Under these circumstances, non-treatment arms (i.e., those taking placebos) could be construed as unethical and should be eliminated. At present, when clinical trials are randomized and blinded, identification of a particularly effective treatment may not be realized until the entire clinical trial is completed.
Another related problem is statistical power. By definition, statistical power refers to the probability of a test appropriately rejecting the null hypothesis, that is, the hypothesis that an experiment's outcome is the result of chance alone. Clinical research protocols are engineered to prove a certain hypothesis about a medical treatment's safety and efficacy, and disprove the null hypothesis. To do so, statistical power is required, which can be achieved by obtaining a large enough sample size of subjects in each study arm. When too few subjects are enrolled into the study arms, there is the risk of the study not accruing enough subjects to enable the null hypothesis to be rejected, and thus not reaching statistical significance. Because clinical trials that are randomized are blinded, the actual number of subjects distributed throughout study arms is not defined until the end of the project. Although this maintains data collection integrity, there are inherent inefficiencies in the system, regardless of the outcome.
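The dependence of statistical power on the number of subjects per study arm can be illustrated with a short sketch. It assumes a two-sided comparison of two event rates using the usual normal approximation; the function name, the event rates of 10% and 20%, and the arm sizes are hypothetical and are not taken from this specification.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(rate_a, rate_b, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test comparing two event rates.

    Normal approximation with equal arm sizes; an illustrative assumption,
    not a method stated in the specification.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    pooled = (rate_a + rate_b) / 2
    spread_null = sqrt(2 * pooled * (1 - pooled))
    spread_alt = sqrt(rate_a * (1 - rate_a) + rate_b * (1 - rate_b))
    z = (abs(rate_a - rate_b) * sqrt(n_per_arm) - z_alpha * spread_null) / spread_alt
    return std_normal.cdf(z)


# Power rises as more subjects are enrolled in each study arm.
for n in (50, 100, 200):
    print(n, round(approx_power(0.10, 0.20, n), 2))
```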
In a case where the study data reaches statistical significance, as accrual of subjects continues, and data is received, an optimal time to close a clinical study would be at the very moment when statistical significance is achieved. While that moment may arrive earlier in the course of a clinical trial, there is no way of knowing this, and therefore time and money are lost. Moreover, study subjects are enrolled above and beyond what is needed to reach the goals of the study, thus placing human subjects under experimentation unnecessarily.
In a case where the study data nearly reaches statistical significance, while the study data falls short of statistical significance, there is reason to believe that this is due to a shortage of enrollment in the study. Frequently, to develop more supportive data, clinical trials will be extended. These “extension studies”, however, can only begin after a full closure of the parent study, frequently requiring months to years before starting again.
In a case where the study data does not reach statistical significance, there is no trend toward significance, and there is little chance of reaching the desired conclusion. In that case, an optimal time to close a study is as early as possible once the conclusion can be established that the treatment under investigation does not work, and study data has little chance of reaching statistical significance (i.e., it is futile). In randomized and blinded clinical trials, this conclusion is difficult to arrive at until data analysis can be conducted. In these situations, time and money are lost. Moreover, an excess of human subjects are placed under study unnecessarily.
Data Safety Monitoring
To mitigate some of the risks related to the conduct of randomized and blinded clinical trials, a Data Safety Monitoring Board (DSMB) may be formed at the beginning of each protocol. In general, a DSMB is recommended for clinical trials that involve a potentially serious outcome (e.g., death, heart attack, etc.), are randomized and blinded, and extend for prolonged periods of time. In addition, a DSMB is required for trials that are sponsored by the United States government, namely, the National Institutes of Health (NIH).
A DSMB generally consists of members who are domain experts in the field of study, such as physicians, as well as bio-statisticians. It is important that DSMB members be separate from personnel of the sponsor organization, and financial disclosure for all members is performed to minimize conflicts of interest. Prior to the start of a clinical trial, standard operating procedures are established for the DSMB, including the frequency of meetings, initiation of interim analyses, conduct during interim analyses and criteria for discontinuation of the clinical trial. As it relates to the safety of study subjects, the DSMB functions to examine trends of adverse occurrences rather than investigate specific reports, which are generally left to each IRB responsible for the activities of any given investigator.
A typical method of collecting and analyzing patient data is illustrated in the flow chart shown in
In addition to the software programs, block 18 may also involve research personnel known as monitors or Clinical Research Associates (CRA) who travel to the various research sites to perform source document verification (SDV) whereby the data in the database 38 is reconciled against individual patient charts to the degree required in the protocol.
If it is determined that the data entered is not clean, then block 22 generates a query which is then sent over the link 20 to the CRC 12. The blocks 14, 18 and 22 are repeated until all of the subject data 10 are entered. This is an iterative process that continues until resolution of all queries in the database 38.
Once all data 10 are entered, block 24 determines whether the clinical trial is over. If no, then the EDC system continues to receive the patient trial data 10 through block 14 as the trial continues. If the trial is over, control passes to block 26 where the entire database is locked from any changes, deletions or insertions of the data in the database 38. In one embodiment, locking involves turning the database 38 into a “read-only” state.
In block 28, blinding data is retrieved from a blinding database. A simplified example blinding database 40 is shown in
A simplified example trial database 38 is shown in
In block 28, an unblinded database is produced from the trial database 38 and the retrieved blinding database 40 in which the subject ID is used as a common key. The result of the unblinding process of block 28 is shown in
In block 30, statistical analysis is performed on the unblinded data 42 to find out the efficacy and safety of the completed clinical trial.
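The conventional unblinding step of block 28, in which the subject ID serves as the common key between the trial database and the blinding database, can be sketched as a simple join. The field names (`subject_id`, `heart_attack`, `study_arm`) and the example records are hypothetical.

```python
def unblind(trial_rows, blinding_rows):
    """Attach each subject's study arm to that subject's trial record.

    Both inputs are lists of dicts keyed by a hypothetical 'subject_id' field.
    Subjects missing from the blinding data are left out of the result.
    """
    arm_by_subject = {row["subject_id"]: row["study_arm"] for row in blinding_rows}
    return [
        dict(row, study_arm=arm_by_subject[row["subject_id"]])
        for row in trial_rows
        if row["subject_id"] in arm_by_subject
    ]


# Hypothetical records for illustration only.
trial = [
    {"subject_id": 101, "heart_attack": 0},
    {"subject_id": 102, "heart_attack": 1},
]
blinding = [
    {"subject_id": 101, "study_arm": "A"},
    {"subject_id": 102, "study_arm": "B"},
]
for record in unblind(trial, blinding):
    print(record)
```

Because the joined records keep the subject ID alongside the study arm, this form of unblinding is suitable only after the trial is over and the database is locked.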
During the course of any given randomized and blinded clinical trial, an interim analysis may be conducted. An interim analysis may result from urging of the DSMB for cause, or be a pre-planned event as described in the study protocol.
Conducting an interim analysis involves a process where the available data is verified and cleaned. The verification process generally involves a process by which trained personnel travel to the various research sites to reconcile submitted data against source documents, which generally implies the patient's chart, laboratory reports, radiographic readings, and others. The data cleaning process may involve a series of documented communications between the research site and a central data coordinating personnel to resolve inconsistencies or other conflicting data.
The refined database must then be sent to an impartial third party for statistical analysis. To conduct the analysis, the statistician must un-blind the clinical trial database by combining both the study data with the blinding key of which subjects are assigned to particular study arms. Since the clinical study is expected to continue beyond the interim analysis, the process of un-blinding must be conducted with great caution, so as not to reveal the blind status of subjects to any personnel involved in the execution of the clinical trial. Once a statistician has completed the interim analysis, a report is issued to the trial sponsor and DSMB.
Inclusive of the data cleaning, verification, un-blinding and statistical analysis processes, as well as the administrative resources for coordinating several groups of personnel for the un-blinding process, an interim analysis is often arduous, time-consuming and expensive.
In spite of the latest technological advancements made in the area of data collection through electronic systems, there is still a disadvantage in that it is very difficult to draw conclusions about a medical treatment while the data is being collected during the trial. This limitation stems primarily from the fact that statistical analysis cannot begin until the trial data has been fully cleaned and processed. At present, statistical analysis can only be conducted upon data in an “en bloc” fashion. This creates a situation where the ability to draw conclusions about a medical therapy inevitably lags behind the process of simply obtaining data in a database.
Regardless of how efficient the data collection process may be made through automation, the ability to acquire the information needed for critical decision-making is still suspended by the requirement to obtain a locked database in order for statistical work to advance.
Therefore, it is desirable to provide a method and system for conducting statistical analysis on the clinical data collected while the trial is ongoing.
In the case of a randomized clinical trial where maintaining confidentiality is important, it is also desirable to provide a secure system in which the blinding information is integrated in such a way that the clinical trial data and blinding data are stored securely to prevent users from accessing the data and yet allow the execution of programs for performing statistical comparisons between study arms while the trial is ongoing.
According to the present invention, a system and method of continuously analyzing trial data of an ongoing clinical trial is provided. A trial database containing subject trial data in a clinical trial is accessed, and a statistical analysis is performed on the accessed trial database. If the result of the statistical analysis does not exceed a predetermined threshold value, then the step of statistical analysis is repeated while the clinical trial is ongoing.
In another aspect of the invention, the present method uses a user definable criteria that defines the level of cleanliness of subject data for statistical analysis. In that case, only those subject data that meet the user defined criteria are selected from the trial database for statistical analysis.
In another aspect of the invention, when the result of the statistical analysis does not exceed the predetermined threshold value, then the analysis program waits for a predetermined time period prior to repeating the statistical analysis step. This is done so that additional subject data are added to the trial database.
In another aspect, the clinical trial is blinded. Accordingly, in addition to the trial database, a blinding database containing subject identifiers and associated study group identifiers is accessed. Each study group identifier identifies which study group an associated subject belongs to. Then a grouped database is produced from the trial database and the blinding database for statistical analysis in which the grouped database groups trial data according to the study group the subjects belong to. Preferably, one data table is created for each study group and contains all trial data for those subjects that belong to that study group.
In yet another aspect of the invention, the unblinded database is stored in a memory device that is inaccessible by any user in order to preserve the blindness of the trial.
In another aspect of the invention, the statistical analysis is performed without locking the trial database.
In another aspect of the invention, if the result of the statistical analysis exceeds the predetermined threshold value, a user is alerted. The predetermined threshold value may include a predetermined statistical significance value.
In another aspect of the invention, there are many statistical models to choose from. A user selectable statistical model is retrieved and the retrieved model is run on the trial database.
As shown in
The system 100 can be any computer such as a WINDOWS-based or UNIX-based personal computer, server, workstation, minicomputer or a mainframe, or a combination thereof. While the system 100 is illustrated as a single computer unit for purposes of clarity, persons of ordinary skill in the art will appreciate that the system may comprise a group of computers which can be scaled depending on the processing load and database size.
In block 52, the routine 50 connects to a trial database 56 through a log-in procedure. A simplified trial database 56 is shown in
In the “Heart Attack” field, an entry of 0 means NO and entry of 1 means YES. The “Heart Attack” field also includes some erroneous data such as “don't know” for subject 118 or “Y” for subject 107. Accordingly, the status for those records indicates a “1” in which queries are outstanding.
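The way erroneous entries lead to a status of “1” can be sketched as a simple validation pass. Only the meanings of 0, 1 and the status value “1” come from the description above; the function name, the field names, the initial status of 3 and subject 101 are hypothetical.

```python
VALID_HEART_ATTACK_VALUES = {"0", "1"}  # 0 means NO, 1 means YES
QUERY_OUTSTANDING = 1  # status for records with outstanding queries


def flag_outstanding_queries(records):
    """Mark records whose 'heart_attack' entry is neither 0 nor 1.

    Updates the 'status' of each offending record in place and returns
    the subject IDs that need a query. Field names are hypothetical.
    """
    flagged = []
    for record in records:
        value = str(record["heart_attack"]).strip()
        if value not in VALID_HEART_ATTACK_VALUES:
            record["status"] = QUERY_OUTSTANDING
            flagged.append(record["subject_id"])
    return flagged


# Subject 101 and the initial status of 3 are invented for illustration.
records = [
    {"subject_id": 101, "heart_attack": 0, "status": 3},
    {"subject_id": 107, "heart_attack": "Y", "status": 3},
    {"subject_id": 118, "heart_attack": "don't know", "status": 3},
]
print(flag_outstanding_queries(records))
```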
Once connected, the routine 50 retrieves in block 60 a user specified criteria 54 stored in the storage device 108 which specifies the status or level of cleanliness of the trial database and in block 61 retrieves the trial database 56 which is filtered for those database records that satisfy the retrieved criteria. For example, if the retrieved user specified criteria is 3, block 61 selects only those records that have a status of 3 or better. Such a filtered database 58 is shown in
Once the database 56 is filtered according to the user specified criteria, block 62 is executed. In block 62, the blinding database 40 is retrieved in the memory 104. In block 64, the filtered trial database 58 and the blinding database 40 are used to produce a grouped database 42. In the embodiment shown, two database tables 66, 68, one for each study group without identifying subjects, are produced. One table 66 groups the Heart Attack data of subjects that belong to a control group (Study Arm A) while the other table 68 groups the Heart Attack data of subjects that belong to a non-control group (Study Arm B). As can be appreciated by persons of ordinary skill in the art, there is no way to trace the origins of any given data point in either table 66 or table 68 to its original subject, and therefore either table, by itself, is relatively uninformative. Taken together, however, note that there seem to be many more heart attacks occurring in Study Arm B.
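The filtering of block 61 and the grouping of block 64 can be sketched together as follows. The sketch assumes that a higher status number means cleaner data, so that “3 or better” is read as a status of 3 or more; it also shuffles each table so that row order cannot be used to trace a value back to a subject, which is an added precaution not stated in this description. All function and field names are hypothetical.

```python
import random


def filter_by_status(trial_rows, min_status):
    """Keep only records whose status meets the user specified criteria.

    Assumes higher status numbers mean cleaner data (an assumption).
    """
    return [row for row in trial_rows if row["status"] >= min_status]


def group_by_arm(filtered_rows, blinding_rows, field="heart_attack", rng=None):
    """Build one table of values per study arm, with no subject identifiers."""
    arm_by_subject = {row["subject_id"]: row["study_arm"] for row in blinding_rows}
    tables = {}
    for row in filtered_rows:
        arm = arm_by_subject.get(row["subject_id"])
        if arm is not None:
            tables.setdefault(arm, []).append(row[field])
    # Shuffle so that row order does not mirror the order of the source records.
    rng = rng or random.Random()
    for values in tables.values():
        rng.shuffle(values)
    return tables


# Hypothetical records for illustration only.
trial = [
    {"subject_id": 101, "heart_attack": 0, "status": 3},
    {"subject_id": 102, "heart_attack": 1, "status": 4},
    {"subject_id": 103, "heart_attack": 1, "status": 1},
    {"subject_id": 104, "heart_attack": 1, "status": 3},
]
blinding = [
    {"subject_id": 101, "study_arm": "A"},
    {"subject_id": 102, "study_arm": "B"},
    {"subject_id": 103, "study_arm": "A"},
    {"subject_id": 104, "study_arm": "B"},
]
print(group_by_arm(filter_by_status(trial, 3), blinding))
```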
In the embodiment shown in
In block 70, the routine retrieves a user defined analysis method 72 stored in the storage device 108 and retrieves the method from the mathematical models 116 stored in the storage device. The model is then run to analyze the grouped database 42. Preferably, a measure of the statistical significance of the safety and efficacy results in the unblinded database, known as the p-value, is obtained. The mathematical model may include one or more formulas, representing mathematical calculations, whereby one or more variables in the clinical trial database are identified, and a numeric result may be obtained. Such formulas might include calculations of: mean, median, mode, range, average deviation, standard deviation, and variance. In addition, an administrator may enter mathematical formulas to further analyze the data to make comparisons between groups of data, as defined by the study arms, to determine statistical metrics and significance by methods including Chi-square analysis, t-test, f-test, one-tailed test, two-tailed test, and Analysis of Variance (ANOVA).
Once the mathematical analysis is completed, a user-defined p-value 74 stored in the storage device 108 is retrieved in block 76. In block 78, it is determined whether the statistical significance indicated by the derived p-value exceeds that of the retrieved user-defined p-value, that is, whether the derived p-value is lower than the user-defined p-value. As discussed in detail previously, a typical user-defined p-value is 0.05, below which the difference between the control group and the non-control group is considered statistically significant. Thus, if the derived value is less than 0.05, the decision in block 78 is YES. The routine then sends an alert in block 80 without displaying the actual output value(s). The alert can be in the form of a flashing display, alarm, a change in the system output display to the user by virtue of color-coding, fonts, icons or text, or an automated system generated message to the user by way of email, facsimile, telephone or pager.
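A minimal sketch of how stored mathematical models might be selected by name and run is given below. It covers a few of the descriptive formulas listed above and a Chi-square comparison of two study arms without continuity correction. The dictionary standing in for the mathematical models 116, the function names and the example counts are hypothetical.

```python
import math
import statistics


def describe(values):
    """Descriptive formulas of the kind listed above (sample-based)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "standard_deviation": statistics.stdev(values),
        "variance": statistics.variance(values),
    }


def chi_square_2x2(events_a, n_a, events_b, n_b):
    """Chi-square statistic and p-value (1 degree of freedom) for two arms."""
    a, b = events_a, n_a - events_a
    c, d = events_b, n_b - events_b
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("table has an empty row or column")
    chi2 = (n_a + n_b) * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the upper tail probability equals erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical stand-in for the stored mathematical models.
MODELS = {"describe": describe, "chi_square_2x2": chi_square_2x2}


def run_model(name, *args):
    """Look up the user defined analysis method by name and run it."""
    return MODELS[name](*args)


chi2, p_value = run_model("chi_square_2x2", 2, 50, 12, 50)
print(round(chi2, 2), p_value < 0.05)
```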
In block 82, the routine, as an option according to a user defined output mode 84, can also create other outputs such as the generic data tables 66, 68 created in block 64. The output data could take various formats including plain text, American Standard Code for Information Interchange (ASCII), and SAS. Where appropriate, this would allow for more customized statistical analysis to be performed. These outputs may also be integrated with other software packages for creation of customized graphical reports.
If the trial is a randomized clinical trial, it is preferable to execute only block 80, which provides a Boolean output as to whether a particular study parameter has reached the desired level of statistical significance. Block 82 is then skipped. The benefit of such a mode is to keep the blinding information as secure as possible and to minimize the ability to make inferences about the study arm of any given subject. If the exact numeric value of statistical significance for a given clinical trial variable were monitored, it is conceivable that the accession of new data could cause statistical metrics for a particular study arm to change in such a manner that an inference could be made regarding the blinding status of the subject whose data was most recently added, thus compromising the statistical veil.
Block 82 may be useful in non-randomized trials, where there is a benefit in displaying the specific numeric value corresponding to statistical significance. Since there is no blinding information to protect, this would be offered as a second mode of operation in the system. Alternatively, a third mode could be provided, whereby numeric ranges of statistical significance are defined as groups and the group into which the result falls is output to the user of the system.
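The three modes of operation can be sketched as a single output function. The mode names and the range boundaries are hypothetical; the boundaries of 0.01 and 0.05 are borrowed from the significance levels discussed earlier and are not prescribed for this purpose.

```python
# Hypothetical range bands, checked in ascending order of their upper bound.
RANGE_BANDS = [
    (0.01, "p < 0.01"),
    (0.05, "0.01 <= p < 0.05"),
]


def render_output(derived_p, threshold_p, mode):
    """Return what the user is allowed to see in each mode of operation.

    'boolean': only whether significance has been reached.
    'numeric': the specific numeric value.
    'range':   only the band into which the value falls.
    """
    if mode == "boolean":
        return derived_p < threshold_p
    if mode == "numeric":
        return derived_p
    if mode == "range":
        for upper_bound, label in RANGE_BANDS:
            if derived_p < upper_bound:
                return label
        return "p >= 0.05"
    raise ValueError(f"unknown output mode: {mode}")


print(render_output(0.03, 0.05, "boolean"))
print(render_output(0.03, 0.05, "range"))
```

Keeping the choice of mode in one place makes it straightforward to restrict a blinded trial to the Boolean mode.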
If, however, the p-value derived is higher than the user-defined p-value, then the derived value does not exceed the user defined threshold value. In that case, the decision is NO and the routine 50 executes block 86. In block 86, the routine 50 waits for a predetermined amount of time and control passes to block 52 where the process of analyzing the trial data while the trial is ongoing is repeated. In other words, the system 100 is active throughout the data collection phase of the clinical trial, sending alerts when key parameters reach the pre-set level of statistical measure.
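The repetition described above, in which the routine waits and then re-runs the analysis while the trial is ongoing, can be sketched as a simple loop. The callables passed in are hypothetical placeholders for the analysis of blocks 52 to 70, the alert of block 80 and a check of whether the trial is still ongoing. Returning after the first alert is an assumption; a system could equally continue monitoring other parameters.

```python
import time


def monitor(run_analysis, threshold_p, wait_seconds, trial_ongoing,
            send_alert, sleep=time.sleep):
    """Repeat the analysis until significance is reached or the trial ends.

    run_analysis():  returns the derived p-value from the current trial data.
    trial_ongoing(): returns True while the clinical trial is ongoing.
    send_alert():    notifies the user without revealing the numeric value.
    Returns True if an alert was sent, False if the trial ended first.
    """
    while trial_ongoing():
        derived_p = run_analysis()
        if derived_p < threshold_p:
            send_alert()
            return True
        # Wait so that additional subject data can be added to the trial database.
        sleep(wait_seconds)
    return False


# Demonstration with invented p-values and a no-op sleep so it finishes at once.
p_values = iter([0.40, 0.20, 0.03])
reached = monitor(
    run_analysis=lambda: next(p_values),
    threshold_p=0.05,
    wait_seconds=3600,
    trial_ongoing=lambda: True,
    send_alert=lambda: print("Alert: pre-set level of significance reached"),
    sleep=lambda seconds: None,
)
print(reached)
```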
As can be appreciated by persons of ordinary skill in the art, the ability of the present clinical trial system 100 to continuously and confidentially monitor and analyze the trial data for statistical significance in tandem with data collection while the trial is ongoing is a tremendous benefit to the researchers. The trial database no longer becomes the bottleneck in obtaining useful results and statistical analysis can be conducted on a near real-time basis.
This continuous near real-time statistical analysis feature in turn has far reaching implications. Specifically, by providing researchers with an early indication of the clinical trial, the present invention shortens the time frame required to reach critical decisions about a new medical therapy. Still another advantage is that the present system improves patient safety by setting thresholds for triggering alerts for adverse events. A related advantage is that a futile trial can be ended early, thereby saving the substantial cost of conducting the trial. Conversely, for a successful medical treatment, a trial can be ended early or the placebo arm can be eliminated. The present invention also provides the ability to more accurately identify the need to perform a full-scale interim analysis.
Various omissions, modifications, substitutions and changes in the forms and details of the device illustrated and in its operation can be made by those skilled in the art without departing in any way from the spirit of the present invention. Accordingly, the scope of the invention is not limited to the foregoing specification, but instead is given by the appended claims along with their full range of equivalents.