Original Article

Negative Results of Randomized Clinical Trials Published in the Surgical Literature: Equivalency or Error?

Justin B. Dimick, MD; Marie Diener-West, PhD; Pamela A. Lipsett, MD

From the Departments of Surgery (Drs Dimick and Lipsett) and Anesthesiology/Critical Care Medicine (Dr Lipsett), Johns Hopkins University School of Medicine; and the Department of Biostatistics (Dr Diener-West), Johns Hopkins University School of Hygiene and Public Health, Baltimore, Md.


Arch Surg. 2001;136(7):796-800. doi:10.1001/archsurg.136.7.796.

Hypothesis  We hypothesized that randomized controlled clinical trials (RCTs) with statistically nonsignificant, or "negative," results published in the surgical literature do not have adequate statistical power to demonstrate equivalency between treatment arms.

Data Sources and Study Selection  The MEDLINE database was searched to obtain reports of all RCTs with negative results published in 3 surgical journals from 1988 to 1998. Manual review of one year (1997) of publications for each journal was performed to validate our search strategy. Equivalency was evaluated using the Two One-Sided Tests Procedure and post hoc power calculations.

Data Synthesis  Ninety reports of RCTs with negative results were identified in the surgical literature between 1988 and 1998. The manual review of 1997 showed a 100% retrieval rate for our search strategy. After applying the Two One-Sided Tests Procedure, 35 reports (39%) met the criteria for demonstrating equivalency. In the other 55 reports (61%), the 90% confidence interval for Δ included an absolute difference of at least 10%. Using the power calculation method, only 22 articles (24%) had a power greater than .80 to detect a 50% difference in therapeutic effect. Only 29% of the reports included a formal sample size calculation, and these studies were more likely to demonstrate equivalency than those without a sample size estimate (P<.01).

Conclusions  Many reports from negative RCTs published in the surgical literature lack sufficient statistical power to establish that clinically important differences are not present. Surgeons should perform appropriate sample size calculations when designing RCTs and recognize the utility of confidence intervals when reporting negative results.


Clinical decisions should be based on the critical appraisal of relevant literature coupled with the experience and judgment of the surgeon. The randomized controlled clinical trial (RCT) is the definitive method to investigate the relative efficacy of 2 or more interventions of interest. However, RCTs comprise only 3% to 7% of research publications in surgical journals.1,2,4 Previous efforts aimed at evaluating the quality of surgical RCTs have shown that many of them contain errors in methodology.1,3,5

When reporting the results of a clinical trial, investigators often state whether the results of a comparison between treatment groups demonstrate a statistically significant difference with respect to the primary outcome or end point. This statement refers to the P value, obtained after applying a statistical hypothesis test. If P is less than some predefined probability (usually .05), the 2 groups are considered statistically different. When P is greater than .05, it is concluded that differences between the groups may be explained by chance alone.

However, there are 2 types of errors owing to chance that may occur during statistical hypothesis testing (Table 1). A type I error concludes, based on the P value, that there is a difference between the intervention and nonintervention groups when one does not exist. A type II error concludes that there is no difference between the treatment groups when one may exist. The power of a study (1 − β) is the probability of detecting a statistically important difference between the 2 groups when such a difference truly exists. Therefore, in a trial in which the 2 therapeutic options seem the same, the underlying statistical power of the study to detect a true difference between the groups must be considered.6,7 Reporting P>.05 is not the same as demonstrating equivalency between 2 treatment options.

Table 1. Conclusions of a Statistical Hypothesis Test in Which the Null Hypothesis States That There Is No Difference in Outcome Between the Treatment Arms

Because surgical interventions are complex and patient or physician preference may limit enrollment, the sample sizes of many RCTs in the field of surgery are small. Consequently, the trials may have inadequate statistical power to detect clinically important differences in therapeutic effect.8-10 This study was undertaken to estimate the prevalence of studies at risk for type II errors in the surgical specialty literature and to discuss the implications of our findings for the design, reporting, and interpretation of RCTs.

LITERATURE SEARCH

The search strategy was designed to yield a sample of RCTs published in the surgical specialty literature from which "negative" trials (those that concluded that there was no difference between the treatment arms) could be selected. The MEDLINE database was searched using the Medical Subject Headings (MeSH) clinical trials and randomized controlled trials, the keywords clinical trials and randomized controlled trials, and the publication types randomized controlled trial and controlled clinical trial. The search was limited to 3 surgery specialty journals (Annals of Surgery, Surgery, and Archives of Surgery) from January 1988 to December 1998.

The abstracts from all articles were reviewed; our analysis included all reports of RCTs that concluded there were equivalent dichotomous outcomes in the treatment arms. The statement regarding equivalency had to be explicit (for example, "there was no statistically significant difference between the groups") and had to refer to a statistical test with P>.05 for the outcome variable of interest. The outcome variable we chose was either clearly labeled as the primary end point or was the primary focus of the article. Articles were excluded for the following reasons: not representing original data (eg, meta-analysis, review articles, editorials, letters); nonrandomized treatment allocation; the use of animal subjects; retrospective data collection; and having a continuous variable as the primary end point. To document the adequacy of our literature search, a manual review of one year (1997) of publications of each journal was performed. Using this information we calculated the percent yield of our MEDLINE search strategy.

DATA ABSTRACTION

The full text of the included articles was systematically reviewed. Data were abstracted and recorded on a standardized form. Information was recorded regarding the type of intervention (surgical, pharmacological, adjuvant oncologic therapy, or other); author affiliations (surgery, anesthesia, medicine, biostatistics, or other); the presence of an a priori power calculation; the event rates in the 2 treatment arms; the number of subjects in each treatment arm; the presence of a post hoc power calculation; and the discussion of lack of power as a weakness. A single data abstractor (J.B.D.) was responsible for the primary review of the full-text articles. To assess the accuracy of our data abstraction, 20 articles were randomly chosen and a second author (P.A.L.) reviewed their full text, repeating the data abstraction. The percent agreement and the κ statistic (the agreement beyond that expected by chance alone) were calculated to assess interobserver variability for each outcome variable. Each study received a number, and data analysis was blinded to the author and institution of the publication.

CONFIDENCE INTERVALS (CIs) AND EQUIVALENCY TESTING

The goal of an experiment evaluating a new therapy is to estimate the proportion of patients who achieve the outcome in the treatment group (PT) and in the control group (PC); Δ, the difference between the 2 groups, is PT − PC. Because the entire population of similar patients is not studied, it was necessary to assess the precision with which our estimate was likely to represent the true difference between the groups. This was accomplished by providing a range of values, based on the observed data, that were consistent with the true value. The precision of the observed difference (Δ) between the treatment groups was best represented by a CI for the true population difference. To calculate the 90% CI for Δ, the standard error (SE) and the upper and lower limits of the 90% CI were calculated using the sample sizes of the control (n1) and treatment (n2) groups and the following equations:
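The reconstruction below (in LaTeX) assumes the standard normal-approximation formulas for the difference between 2 independent proportions, with 1.645 as the standard normal quantile corresponding to a 90% CI:

    \Delta = P_T - P_C, \qquad
    SE = \sqrt{\frac{P_C\,(1 - P_C)}{n_1} + \frac{P_T\,(1 - P_T)}{n_2}},
    \qquad 90\%\ \mathrm{CI} = \Delta \pm 1.645 \times SE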

To determine which reports demonstrated true negative results (equivalency), we used the Two One-Sided Tests Procedure.11,12 First, a (1 − 2α) × 100% CI was constructed for the absolute difference between treatment groups in each study. Equivalency was concluded if the limits of this CI fell entirely within a predetermined equivalency interval. For most studies, α was set at .05 and 90% CIs were calculated. Consistent with common practice in the literature, we considered absolute differences of plus or minus 10% and 25% to be clinically important. For instance, if the event rate was 10% in the experimental group and 20% in the control group, the absolute difference between the 2 groups (Δ) would be 10%, a relative reduction of 50%.
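The procedure can be summarized in a short sketch. The following is an illustrative Python implementation of the normal-approximation version described above, not code from the study; the function name and the example event counts are assumptions.

    from statistics import NormalDist

    def tost_equivalent(events_t, n_t, events_c, n_c, margin=0.10, alpha=0.05):
        """True if the (1 - 2*alpha) CI for the absolute difference in
        event rates lies entirely within the +/-margin equivalency interval."""
        p_t, p_c = events_t / n_t, events_c / n_c
        delta = p_t - p_c
        se = (p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c) ** 0.5
        z = NormalDist().inv_cdf(1 - alpha)  # 1.645 when alpha = .05 (90% CI)
        lower, upper = delta - z * se, delta + z * se
        return -margin < lower and upper < margin

    # Hypothetical trial: 12/120 vs 15/118 events, +/-10% equivalency margin
    print(tost_equivalent(12, 120, 15, 118))  # True: the 90% CI lies within +/-10%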

POWER CALCULATIONS

The power to detect a predefined effect size was calculated for each RCT. We chose to calculate the power needed to detect a difference (Δ) of 25% and 50% between the 2 groups given the baseline event rate. For each publication, the event rate in the control group (PC) was determined, and the proportions representing a 25% [PC − (0.25)(PC)] and a 50% [PC − (0.5)(PC)] reduction were calculated. Using the number of patients in each treatment arm and these proportions, the power of each study was calculated with α set at .05 (2-tailed). Publications reporting an event rate of 0 in both treatment groups were excluded from the power analysis but were included in the assessment of other end points.
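As an illustration of this kind of post hoc calculation, the Python sketch below uses the standard normal approximation for comparing 2 proportions; it is a simplified stand-in for the method used in the study, and the example numbers are invented.

    from statistics import NormalDist

    def posthoc_power(p_c, n_c, n_t, relative_reduction, alpha=0.05):
        """Approximate power to detect the stated relative reduction from
        the control event rate p_c (2-tailed test, normal approximation)."""
        nd = NormalDist()
        p_t = p_c * (1 - relative_reduction)   # e.g. 0.5 for a 50% reduction
        se = (p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t) ** 0.5
        z_crit = nd.inv_cdf(1 - alpha / 2)     # 1.96 for alpha = .05
        return nd.cdf(abs(p_c - p_t) / se - z_crit)

    # Hypothetical trial: 40% baseline event rate, 100 patients per arm
    print(round(posthoc_power(0.40, 100, 100, 0.50), 2))  # about 0.89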

OUTCOME VARIABLES

The primary outcome of our investigation was whether the reported results of an RCT met the criteria for equivalency using the Two One-Sided Tests Procedure. One secondary outcome was the post hoc power calculation for each report. An unacceptable risk of a type II error was defined as a post hoc power of less than 80% (β>.2). Other secondary outcomes of this study were the presence of an a priori power calculation, the presence of a post hoc power calculation, and the discussion of lack of power as a weakness of the study. In addition, the reports were divided according to the type of intervention (surgical procedure; adjuvant therapy [chemotherapy, external beam or intraoperative radiation, or immunologic tumor vaccine]; or other pharmacological agent), the journal of publication, and whether an a priori sample size calculation was reported. The χ2 test was used to test for associations between these study characteristics and failure to demonstrate equivalency. All statistical analyses were performed using Stata version 6.0 (Stata Corp, College Station, Tex).
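A minimal sketch of such an association test, assuming SciPy is available. The 2 × 2 counts are hypothetical (chosen only to match the marginal totals reported later, not taken from the paper):

    from scipy.stats import chi2_contingency

    # Rows: trials with / without an a priori sample size calculation;
    # columns: demonstrated equivalency vs did not (hypothetical counts)
    table = [[18, 10],
             [17, 45]]
    chi2, p, dof, expected = chi2_contingency(table)
    print(round(chi2, 2), round(p, 4))  # p < .01 for these invented counts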

RESULTS

The MEDLINE search strategy yielded 526 publications (Figure 1). After applying the initial exclusion criteria, 268 prospective RCTs using human subjects remained. Randomized clinical trials represented 3.2% of the total number of publications during the 11-year study period (1988-1998). The denominator of total publications included reviews, case reports, and clinical and basic science articles; therefore, the proportion of clinical studies that were randomized trials is likely higher than this figure.

Figure 1. Results of the MEDLINE Search.

Of these abstracts, 136 (51%) concluded that the outcomes were equivalent in the treatment arms. The full text of these articles was obtained for further review, and 8 were excluded because they failed to explicitly conclude equivalency between the treatment groups within the text: although the abstracts stated that the 2 treatments might be equivalent, the body of the article did not clearly support this conclusion, particularly when more than 1 outcome was considered and no primary outcome was evident. In addition, 32 articles that had a continuous rather than a dichotomous variable as the primary outcome were excluded. The remaining 96 RCTs had dichotomous primary outcome variables and explicitly concluded equivalency between the treatment groups. The manual search of the target journals for 1997 demonstrated that our MEDLINE search strategy retrieved all 26 RCTs (100%) published in those journals that year.

Table 2 presents the percent agreement and κ statistics associated with interobserver variability in data abstraction for the 20 randomly selected articles. There was excellent agreement between observers in assessing the risk of a type II error (κ = 1.0 [100% agreement]) and the presence of an a priori sample size calculation (κ = 0.89 [95% agreement]). Both of these values were interpreted as having "almost perfect" agreement.13 The more subjective assessments, such as the presence of post hoc power calculation (κ = 0.59; 85% agreement) and discussion of lack of power as a limitation (κ = 0.61; 80% agreement), demonstrated more interobserver variability; these values were interpreted as having "moderate" and "substantial" agreement, respectively.13

Table 2. Interobserver Variability in Data Abstraction for 20 Randomly Selected Articles

Table 3 presents the journals of publication and associated characteristics of the 96 articles included in the analysis. There seemed to be no increase in the publication of RCTs during the 11-year period, with approximately equal numbers derived from each time interval. Most trials tested the efficacy of either a surgical procedure (n = 43) or a pharmacological agent (n = 42); the remainder involved adjuvant cancer therapy (n = 6) or other interventions (n = 5). Surgeons were the sole authors of 50 articles (52%); in the remaining articles, they shared authorship predominantly with colleagues in the departments of medicine (n = 14), anesthesia (n = 8), and biostatistics (n = 6).

Table 3. Journal of Publication and Characteristics of Reports From Selected Randomized Clinical Trials

The Two One-Sided Tests Procedure was performed on 90 articles. Six articles were not included in the equivalency analysis because the event rate was 0 in both of the treatment groups. Of the included articles, 35 (39%) demonstrated equivalency (given an equivalency interval of ±10% absolute difference). The 90% CIs for the differences between treatment and control groups are shown in Figure 2. In the power analysis, none of the articles demonstrated an 80% or greater power to detect a 25% relative difference in the treatment groups and only 24% had a power of 80% or more to detect a 50% relative difference. Of the reports of RCTs that were at risk for type II errors, only 14 (19%) mentioned a small sample size or lack of power as a weakness. Furthermore, only 7 articles (9%) presented a post hoc power analysis, formally addressing the lack of power in their study. Twenty-eight trials (29%) included an explicit sample size calculation in the report and these trials were less likely to be at risk for a type II error (P<.01).

Figure 2. Point estimates of the difference between treatment groups (Δ) with 90% confidence intervals for 90 negative randomized clinical trials.

COMMENT

This study documents that many reports of negative clinical trials published in the surgical literature lack precision, are at risk for a type II error, and do not demonstrate equivalency. In other words, the reports may conclude that there is no difference between intervention and control or placebo groups when one may exist. We used 2 approaches to assess the risk of a false conclusion, and both demonstrated similar results. Using an estimation approach for equivalency testing, only 39% of reports satisfied the criteria for equivalence. Likewise, using a hypothesis testing approach, only 24% of the articles had a power greater than 80% to detect a 50% difference between the treatment arms. Thus, 61% of the RCTs were failed experiments in that the researchers failed to reach a conclusion, either of equivalence or dissimilarity. Such failures can be prevented by appropriate a priori power considerations. Furthermore, many authors did not include a formal sample size calculation or discuss lack of power as a limitation. These findings have important implications for surgical decision making. If studies with inadequate statistical power or sample size fail to demonstrate a benefit of a particular therapy, we may inappropriately label that therapy as ineffective and refrain from pursuing further research, effectively abandoning a potentially efficacious therapy.

In a landmark study published in 1978, Freiman et al14 conducted a survey of 71 negative RCTs published in the medical literature. They demonstrated that many RCT reports do not have adequate power to demonstrate clinically important differences in therapeutic effect: of the 71 trials evaluated, 67 had less than 90% power to detect a 25% therapeutic improvement, and 50 had less than 90% power to detect a 50% improvement.14 We chose to assess 80% rather than 90% power; consequently, our estimate of studies at risk for type II errors may be more conservative. Since 1978, similar studies have demonstrated that a large proportion of RCTs published in emergency medicine,15 hand surgery,13 and the Australian medical literature16 have the same methodologic shortcomings.

The decision to treat a patient with a given intervention, whether it be a surgical procedure or a pharmacologic agent, should be based on the best available evidence from clinical trials coupled with the experience and judgment of the surgeon. In recent years, there has been increased emphasis on the RCT as the definitive method to evaluate the relative efficacy and toxicity of a therapeutic intervention. In an ideal RCT, the 2 treatment groups should have equal likelihood of achieving the outcome of interest independent of the intervention. Randomization, therefore, effectively eliminates much of the systematic error (otherwise known as bias) that plagues many other study designs.

Although bias is minimized in an RCT, it is important to consider errors introduced by chance in statistical decision making. The risk of making a type I error (α) is determined by the investigator and is often set at a level of 5% or less. Unlike the type I error, the probability of a type II error (β) is not set by the investigator; it is a function of α, the size of the study population, the frequency of the outcome of interest, and the magnitude of the difference between the 2 groups being studied (Δ). For our study, we chose absolute differences of 10% and 25% (Two One-Sided Tests Procedure) and a relative difference of 50% for the power calculations because differences of this magnitude are commonly considered to be clinically significant.11,12

The probability of a type II error depends, in part, on the results of the RCT and is therefore subject to unplanned variation. Furthermore, the type II error is not commonly addressed in the discussions of many reports of RCTs and may be an overlooked source of inaccurate interpretation of clinical trial results. Concluding that 2 therapeutic options are the same when they in fact differ to a clinically significant degree may have grave consequences. Our study demonstrates that most RCTs published in surgical specialty journals are at risk for this type of error, assuming that the magnitude of difference between the treatment options is 50%. When designing clinical trials in the future, surgeons should calculate both sample size and power a priori, in the early planning stages. Obtaining an estimate of the required sample size early in planning may help refine the research question. For instance, if the required sample is prohibitively large, choosing a surrogate outcome variable with a higher frequency may allow a smaller sample. Alternatively, the trial may be judged infeasible before significant resources are needlessly spent. Only 29% of the reports of RCTs evaluated in our study included sample size estimates, but those that did were less likely to be at risk for a type II error. Solomon et al1 and Hall et al3 reported similarly low percentages of reports of RCTs in surgery with adequate sample size determinations (11% and 19%, respectively).

Performing a sample size calculation is relatively straightforward and can be conducted using any statistical software package or the tables available in most statistical texts.17 First, the expected frequency of the outcome variable in the nonintervention group must be estimated; this is usually taken from previously published research or, in some instances, from a pilot study. Second, the smallest difference between the 2 treatment arms that is felt to be clinically meaningful (Δ) must be chosen; a relative change of 25% to 50% from the baseline event rate is generally accepted. Third, the minimally acceptable probabilities of type I and type II statistical errors must be chosen. The generally accepted value of β is .20 or .10 (power of .80 to .90). However, in a trial specifically designed to demonstrate equivalency of 2 therapeutic options (in which a negative result is anticipated), a smaller β (larger power) is desired, since avoiding a type II error in this setting is particularly important.5,6 In our study, we set the power at .80, which is the minimally acceptable level, especially for trials that claim equivalency. These 3 steps are illustrated in the sketch that follows.
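The Python sketch below applies the standard normal-approximation formula for comparing 2 proportions; it is illustrative rather than the specific method of any cited text, and the example inputs are hypothetical.

    import math
    from statistics import NormalDist

    def sample_size_per_arm(p_c, relative_reduction, alpha=0.05, power=0.80):
        """Approximate patients per arm to detect the stated relative
        reduction from baseline rate p_c (2-tailed, normal approximation)."""
        nd = NormalDist()
        p_t = p_c * (1 - relative_reduction)
        z_a = nd.inv_cdf(1 - alpha / 2)   # 1.96 for alpha = .05
        z_b = nd.inv_cdf(power)           # 0.84 for 80% power
        variance = p_c * (1 - p_c) + p_t * (1 - p_t)
        return math.ceil((z_a + z_b) ** 2 * variance / (p_c - p_t) ** 2)

    # Example: 20% baseline event rate, seeking a 50% relative reduction
    print(sample_size_per_arm(0.20, 0.50))  # 197 patients per arm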

When surgeons obtain a negative result, they should report the associated 90% or 95% CI of the difference in outcome between the groups (Δ). This is necessary to appropriately interpret the results and appreciate the risk of a type II error. If the CI contains clinically significant differences in the outcome of interest, the Two One-Sided Tests Procedure for demonstrating equivalency will assist in understanding the difference between the 2 treatment options. In this technique, the precision of the absolute difference (Δ) in the primary outcome variable between treatment groups is assessed by creating a 90% CI. Before the calculation is performed, one must decide what constitutes a clinically significant difference between the groups. In general, a 10% absolute difference (eg, 10% in group A and 20% in group B, a 50% relative difference) would be clinically significant. If the 90% CI lies entirely within this predefined equivalency interval, we can be relatively sure that the treatment options are equivalent. Otherwise, the trial fails to demonstrate equivalency, and we have not given the intervention of interest a fair trial.
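Applied to the worked example above (10% in group A vs 20% in group B), the equivalency check sketched earlier fails, as expected: the 90% CI for the 10% observed difference extends beyond the ±10% interval. The per-arm counts here are assumed, since the text gives only the rates.

    # Hypothetical counts: 20/200 events in group A vs 40/200 in group B
    print(tost_equivalent(20, 200, 40, 200, margin=0.10))  # False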

Our study has several limitations. We did not include all surgical research published during the study period: some RCTs are not published in the surgical literature but appear in larger multidisciplinary journals. However, this study was designed to examine the surgical specialty literature and calculate the frequency of type II errors in those journals. Also, we used a search of a computerized database (MEDLINE) to locate articles, running the risk of missing a certain proportion of RCTs; previous authors have shown MEDLINE to retrieve less than half of the RCTs published during a given period.1 To minimize this risk, we performed a manual search of the target journals for a 1-year period; this effort demonstrated a 100% retrieval rate for our search strategy, allowing us to be confident that the articles included represented most of the RCTs during the study period. An additional limitation comes from the assumptions made in performing the power calculations. We assigned the same relative difference (50%) to each trial. While this was necessary to perform the calculation, it is not ideal: the Δ that is clinically important is specific to each therapy and each population.

These findings have important implications for the future design and interpretation of RCTs in surgery. Surgeons should conduct a sample size calculation during the design phase of the study. In addition, they should report the CI of the difference between the treatment groups. If there is inadequate statistical power to detect clinically significant differences between treatment groups, this should be explicitly stated in the conclusion. Such practice may better inform the reader and promote further study of potentially efficacious therapies.

Corresponding author and reprints: Pamela A. Lipsett, MD, Department of Surgery, Johns Hopkins Hospital, 600 N Wolfe St, Blalock 685, Baltimore, MD 21287-4683 (e-mail: plipsett@jhmi.edu).



References

1. Solomon MJ, Laxamana A, Devore L, et al. Randomized controlled trials in surgery. Surgery. 1994;115:707-712.
2. Horton R. Surgical research or comic opera: questions, but few answers. Lancet. 1996;347:984-985.
3. Hall JC, Mills B, Nguyen H, et al. Methodologic standards in surgical trials. Surgery. 1996;119:466-472.
4. Solomon MJ, McLeod RS. Should we be performing more randomized controlled trials evaluating surgical operations? Surgery. 1995;118:459-467.
5. Hall JC, Platell C, Hall JL. Surgery on trial: an account of clinical trials evaluating operations. Surgery. 1998;124:22-27.
6. Hulley SB, Cummings SR. Estimating sample size and power. In: Designing Clinical Research. Baltimore, Md: Williams & Wilkins; 1998:139-150.
7. Lachin JM. Introduction to sample size determination and power analysis for clinical trials. Control Clin Trials. 1981;2:93-114.
8. Lawrence W Jr. Some problems with trials. Arch Surg. 1991;126:370-378.
9. McLeod RS, Wright JG, Solomon MJ, et al. Randomized controlled trials in surgery: issues and problems. Surgery. 1996;119:483-486.
10. Bell PRF. Surgical research and randomised trials. Br J Surg. 1997;84:737-738.
11. Schuirmann DJ. A comparison of the Two One-Sided Tests procedure and the power approach for assessing the equivalence of average bioavailability. J Pharmacokinet Biopharm. 1987;15:657-680.
12. Kirshner B. Methodologic standards for assessing therapeutic equivalence. J Clin Epidemiol. 1991;44:839-849.
13. Brown CG, Kelen GD, Ashton JJ, et al. The beta error and sample size determination in clinical trials in emergency medicine. Ann Emerg Med. 1987;16:183-187.
14. Freiman JA, Chalmers TC, Smith H, et al. The importance of beta, the type II error and sample size in the design and interpretation of the randomized controlled trial. N Engl J Med. 1978;299:690-694.
15. Chung KC, Kalliainen LK, Hayward RA. Type II (beta) errors in the hand literature: the importance of power. J Hand Surg. 1998;23:20-25.
16. Hall JC. The other side of statistical significance: a review of type II errors in the Australian medical literature. Aust N Z J Med. 1982;12:7-9.
17. Sackett DL, Haynes RB, Guyatt GH. Clinical Epidemiology: A Basic Science for Clinical Medicine. 2nd ed. New York, NY: Little, Brown & Co; 1991.
