Type I vs Type II Error — False Positives vs False Negatives

By CalcMulti Editorial Team · 8 min read

In hypothesis testing, two types of errors can occur. A Type I error (false positive) means you reject the null hypothesis when it is actually true — you conclude an effect exists when it does not. A Type II error (false negative) means you fail to reject the null hypothesis when it is actually false — you miss a real effect.

Understanding these errors is essential for designing studies, setting significance thresholds, and interpreting results. The probability of each error type can be controlled, but reducing one tends to increase the other — the fundamental trade-off of statistical inference.


Side-by-Side Comparison

| Property | Type I Error (α) | Type II Error (β) |
| --- | --- | --- |
| Also called | False positive | False negative |
| What it means | Reject H₀ when H₀ is true | Fail to reject H₀ when H₀ is false |
| Probability | α (significance level, typically 0.05) | β (typically set at 0.10–0.20) |
| Statistical power | Lower α → harder to reject H₀ → lower power | Power = 1 − β (probability of detecting a real effect) |
| Controlled by | Choosing the significance level (α) | Sample size and effect size assumptions |
| Effect of larger n | Does not change (α is fixed by choice) | β decreases (power increases) with larger n |
| Consequence in medicine | Approving an ineffective drug | Missing an effective treatment |
| Consequence in testing | Failing a passing student | Passing a failing student |
| Stricter threshold (lower α) | Reduces Type I errors | Increases Type II errors |
| Preferred balance | Depends on costs of each error in context | High power (1 − β ≥ 0.80) is the convention |

Type I Error (False Positive) — Explained

A Type I error occurs when you reject the null hypothesis (H₀) when it is actually true. You conclude there is a real effect, relationship, or difference when in reality there is none — your finding is a false alarm.

Probability: The probability of a Type I error is α, the significance level. At α = 0.05, there is a 5% chance of a false positive when H₀ is true — meaning if you ran 100 tests under true null conditions, about 5 would falsely reach significance.
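The "5 in 100 tests" claim can be checked directly by simulation. The sketch below (illustrative code, not part of the original article) draws two samples from the same distribution — so H₀ is true by construction — and counts how often a t-test falsely reaches significance:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
n_sims, n = 10_000, 30

false_positives = 0
for _ in range(n_sims):
    # Both samples come from the SAME distribution, so H0 is true.
    a = rng.normal(0, 1, n)
    b = rng.normal(0, 1, n)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1

print(false_positives / n_sims)  # close to alpha = 0.05
```

Over many simulated experiments, the false-positive rate converges to α, regardless of the sample size used in each test.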

Real consequences: (1) Medicine — a drug is approved as effective when it is not; patients are exposed to side effects with no benefit. (2) Business — a new marketing strategy is implemented because it appeared to improve sales, but the improvement was noise. (3) Research — a false finding is published, wasting subsequent research effort trying to replicate or build on it.

How to reduce Type I errors: lower α (e.g., use 0.01 instead of 0.05); use pre-registration to prevent p-hacking; correct for multiple comparisons (Bonferroni correction when testing many hypotheses).

Type II Error (False Negative) — Explained

A Type II error occurs when you fail to reject the null hypothesis when it is actually false. A real effect exists, but your test did not detect it — you missed the signal.

Probability: β. By convention, β ≤ 0.20 (80% power) is the standard in most fields. This means: if the true effect size equals your assumption, you have an 80% chance of detecting it. The remaining 20% chance is the Type II error rate.

Real consequences: (1) Medicine — an effective drug is rejected in a trial because the sample was too small to detect the benefit; patients are denied a helpful treatment. (2) Safety testing — a product defect is not detected because the inspection sample was too small. (3) Research — a real phenomenon is dismissed as "no effect found," suppressing a valid line of inquiry.

How to reduce Type II errors: increase sample size (the most direct method); increase α (accept more false positives); choose a more sensitive measurement instrument (reduce noise); use a one-tailed test if the direction is pre-specified.
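The effect of sample size on Type II errors can be seen in a small Monte Carlo sketch (the effect size 0.5 and group sizes below are illustrative assumptions, not figures from the article). Here H₀ is false by construction, so the detection rate estimates power = 1 − β:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, effect, n_sims = 0.05, 0.5, 5_000

def simulated_power(n):
    """Fraction of simulated trials that detect a real effect of size `effect`."""
    hits = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n)
        treated = rng.normal(effect, 1.0, n)  # H0 is false by construction
        _, p = stats.ttest_ind(control, treated)
        hits += p < alpha
    return hits / n_sims

power_small = simulated_power(20)   # n = 20 per group
power_large = simulated_power(80)   # n = 80 per group
print(power_small, power_large)     # larger n -> higher power, lower beta
```

With 20 per group, roughly two-thirds of real effects of this size are missed; quadrupling the sample size raises power to near 0.9.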

The Alpha–Beta Trade-Off

For a fixed sample size, there is a direct trade-off between α (Type I error rate) and β (Type II error rate). Making the significance threshold stricter (lower α) means it takes stronger evidence to reject H₀ — but this also means more real effects will fail to reach significance (higher β).

Example: A drug trial uses α = 0.05. The drug has a small but real effect. With n = 50, power = 65% — a 35% Type II error rate. If you tighten to α = 0.01, power drops to 45% — now 55% of real effects are missed. The only way to maintain power while tightening α is to increase sample size.
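The same pattern can be reproduced analytically. The sketch below uses a two-sample z-test (a simplification assuming known σ = 1; the effect size 0.4 and n = 50 per group are illustrative assumptions) to show power falling as α is tightened:

```python
from scipy.stats import norm

def z_test_power(effect, n, alpha):
    """Power of a two-sided two-sample z-test with sigma = 1 and n per group."""
    ncp = effect * (n / 2) ** 0.5        # noncentrality for two equal groups
    z_crit = norm.ppf(1 - alpha / 2)
    # Probability the test statistic lands beyond either critical value
    return norm.sf(z_crit - ncp) + norm.cdf(-z_crit - ncp)

for a in (0.05, 0.01):
    p = z_test_power(effect=0.4, n=50, alpha=a)
    print(f"alpha={a}: power={p:.2f}, beta={1 - p:.2f}")
```

Tightening α from 0.05 to 0.01 nearly halves the power here — exactly the trade-off described above.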

The right balance depends on the relative costs. In drug safety trials, a false positive (approving a harmful drug) is catastrophic — use very small α (0.01 or 0.001). In preliminary screening for promising research leads, missing a real effect (high β) is costly — accept α = 0.10 with awareness that further confirmation is needed.

Statistical Power = 1 − β

Statistical power is the probability of correctly rejecting H₀ when it is false — detecting a real effect. Power = 1 − β. A power of 0.80 means an 80% chance of getting a significant result when the true effect equals your assumed value.

Factors that increase power: (1) Larger sample size — the most controllable factor. (2) Larger true effect size — you cannot control reality, but you can focus on populations where effects are large. (3) Higher α — accepting more false positives buys more power. (4) Lower measurement noise (smaller σ) — better instruments or more controlled conditions. (5) One-tailed instead of two-tailed test — if you have strong directional priors.

Power analysis: Before running a study, choose your target power (usually 0.80), specify the minimum effect size you want to detect, and set α. The required sample size follows from these inputs. Under-powered studies waste resources — they cannot reliably detect effects even when they exist. An underpowered non-significant result is uninformative.
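The sample-size step of a power analysis can be sketched with the standard closed-form formula for a two-sample z-test, n per group = 2((z₁₋α/₂ + z₁₋β)/d)², where d is the standardised effect size (a simplification assuming known σ; exact t-based calculations give slightly larger n):

```python
from math import ceil
from scipy.stats import norm

def n_per_group(effect, alpha=0.05, power=0.80):
    """Required sample size per group for a two-sided two-sample z-test."""
    z_a = norm.ppf(1 - alpha / 2)   # critical value for the chosen alpha
    z_b = norm.ppf(power)           # quantile for the target power
    return ceil(2 * ((z_a + z_b) / effect) ** 2)

print(n_per_group(0.5))   # medium effect: ~63 per group
print(n_per_group(0.2))   # small effect: ~393 per group
```

Note how halving the detectable effect size roughly quadruples the required n — small effects are expensive to detect.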

| Power (1−β) | β (Type II error rate) | Interpretation |
| --- | --- | --- |
| 0.99 | 0.01 | Excellent — 99% chance of detecting the effect |
| 0.90 | 0.10 | Very good — used in high-stakes research |
| 0.80 | 0.20 | Conventional standard — adequate for most research |
| 0.70 | 0.30 | Low — misses 30% of real effects |
| 0.50 | 0.50 | Coin flip — underpowered study |
| < 0.50 | > 0.50 | Poorly powered — likely to miss real effects |

Multiple Comparisons and Type I Error Inflation

When you run multiple hypothesis tests simultaneously, the probability of at least one Type I error grows rapidly. If you run 20 tests at α = 0.05, and all null hypotheses are true, you expect 1 false positive on average — and there is a 64% chance of at least one. This is the multiple comparisons problem.
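Both figures follow from basic probability: with m independent tests under true nulls, P(at least one false positive) = 1 − (1 − α)ᵐ, and the expected count is αm.

```python
alpha, m = 0.05, 20

fwer = 1 - (1 - alpha) ** m        # P(at least one false positive)
expected = alpha * m               # expected number of false positives

print(f"P(>=1 false positive) = {fwer:.2f}")   # ~0.64
print(f"Expected false positives = {expected:.1f}")  # 1.0
```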

Bonferroni correction: divide α by the number of tests. For 20 tests at α = 0.05, use α* = 0.05/20 = 0.0025 for each test. This controls the family-wise error rate (FWER) — the probability of any false positive — at 0.05.

False Discovery Rate (FDR) correction (Benjamini-Hochberg): controls the expected proportion of rejected hypotheses that are false positives. Less conservative than Bonferroni — used when some discoveries are expected (e.g., genomics, where many genes are being tested simultaneously and some real associations are expected).
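The Benjamini–Hochberg procedure is simple enough to sketch directly: sort the p-values, find the largest rank i with p₍ᵢ₎ ≤ q·i/m, and reject everything up to that rank (the p-values below are made-up illustrative data):

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level q."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m     # q*i/m for i = 1..m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.where(below)[0])           # largest i with p_(i) <= q*i/m
        reject[order[: k + 1]] = True            # reject all ranks up to k
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.459]
print(benjamini_hochberg(pvals))  # only the two smallest p-values survive
```

For comparison, a Bonferroni threshold of 0.05/10 = 0.005 would reject only the single smallest p-value here — illustrating why BH is the less conservative choice.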

Summary

Type I error (false positive, α) and Type II error (false negative, β) represent opposite failure modes of hypothesis testing. The right balance depends on the relative cost of each error in your specific context.

  • Type I error rate (α) is directly set by the significance threshold — choose α based on the cost of false positives
  • Type II error rate (β) is controlled through sample size — larger n gives more power (lower β)
  • The 80% power / α = 0.05 convention is a reasonable default, but adjust based on context
  • Multiple comparisons inflate Type I error — use Bonferroni or FDR correction when testing many hypotheses
  • When the cost of missing a real effect is very high (medicine, safety), prioritise high power (small β)


Educational use only. Content is based on publicly documented mathematical formulas and reviewed for accuracy by the CalcMulti Editorial Team. Last updated: February 2026.