Question 1

What does the p-value actually mean?

Accepted Answer

The p-value is the probability of getting a test result at least as extreme as the one observed, assuming H₀ is true. It is NOT the probability that H₀ is true, the probability that you made an error, or the effect size. p = 0.02 means that if H₀ were true, a result at least this extreme would occur by chance only 2% of the time, which counts as evidence against H₀. p = 0.40 means results this extreme are common under H₀, so the data give little evidence against it.
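To make "at least as extreme, assuming H₀ is true" concrete, here is a minimal sketch using a coin-flip example that is not part of the original answer. The scenario (60 heads in 100 flips, H₀: the coin is fair) and the function name are illustrative assumptions.

```python
from math import comb

def binomial_two_sided_p(heads: int, flips: int) -> float:
    """Exact two-sided p-value for H0: the coin is fair (p = 0.5)."""
    expected = flips / 2
    observed_gap = abs(heads - expected)
    # Count every outcome at least as far from the expected number of
    # heads as the observed one, in either direction.
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    # Under H0 each of the 2**flips sequences is equally likely.
    return extreme_outcomes / 2 ** flips

# Hypothetical data: 60 heads in 100 flips.
p_value = binomial_two_sided_p(60, 100)
print(f"p = {p_value:.4f}")
```

The result is the share of all possible outcomes under H₀ that are as surprising as the data or more so. It says nothing about how likely H₀ itself is.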

Question 2

How do I interpret p < 0.05?

Accepted Answer

p < 0.05 means: at the 5% significance level, you reject H₀. This does not mean the result is practically important — a tiny effect with large n can produce p < 0.05 with no real-world significance. Always pair p-values with an effect size (Cohen's d, r, relative risk). The 0.05 threshold is a convention, not a natural law. Report exact p-values rather than just 'p < 0.05'.
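The point about tiny effects and large n can be sketched numerically. The example below assumes two equal-sized groups with equal variances and uses the large-sample z approximation, so z = d × √(n/2) for Cohen's d and n observations per group. The effect size 0.02 and the sample sizes are made-up illustration values.

```python
from math import erfc, sqrt

def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return erfc(abs(z) / sqrt(2))

def approx_p_for_effect(cohens_d: float, n_per_group: int) -> float:
    """Approximate p-value for a two-group mean comparison.

    Assumes equal group sizes, equal variances, and a sample large
    enough for the normal approximation to be reasonable.
    """
    z = cohens_d * sqrt(n_per_group / 2)
    return two_sided_p_from_z(z)

# The same negligible effect (d = 0.02) at different sample sizes.
for n in (100, 5_000, 50_000):
    print(f"n per group = {n:>6}: p = {approx_p_for_effect(0.02, n):.4f}")
```

The effect size never changes in this loop; only n does. That is why a p-value on its own cannot tell you whether a result matters in practice.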

Question 3

What is the difference between one-tailed and two-tailed p-values?

Accepted Answer

A two-tailed test asks whether the parameter differs from the H₀ value in either direction. A one-tailed test asks about one specific direction. For symmetric test statistics such as z and t, two-tailed p = 2 × one-tailed p, provided the observed effect lies in the hypothesized direction. Example: z=2.0 → two-tailed p=0.0455, one-tailed (right) p=0.0228. Two-tailed is more conservative and appropriate when you have no a priori directional hypothesis. Use one-tailed only if you committed to a direction before data collection.
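The z = 2.0 example can be reproduced with the standard library alone. This is a minimal sketch for a z statistic; the function names are illustrative, not part of any calculator API.

```python
from math import erfc, sqrt

def right_tail_p(z: float) -> float:
    """One-tailed p-value for H1: parameter is greater than the H0 value."""
    return 0.5 * erfc(z / sqrt(2))

def left_tail_p(z: float) -> float:
    """One-tailed p-value for H1: parameter is less than the H0 value."""
    return 0.5 * erfc(-z / sqrt(2))

def two_tailed_p(z: float) -> float:
    """Two-tailed p-value for H1: parameter differs from the H0 value."""
    return erfc(abs(z) / sqrt(2))

z = 2.0
print(f"two-tailed: {two_tailed_p(z):.4f}")
print(f"right tail: {right_tail_p(z):.4f}")
print(f"left tail:  {left_tail_p(z):.4f}")
```

Note that the left-tail p-value for z = 2.0 is close to 1, not half the two-tailed value. The doubling rule only applies when the data point in the direction you hypothesized.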

Question 4

Which test statistic should I use: z, t, or chi-square?

Accepted Answer

z-test: use when testing a mean with known σ, or for large samples (roughly n ≥ 30), where the CLT makes the normal approximation reasonable. t-test: use when testing means with unknown σ (the usual case in practice); a one-sample test has df = n−1 degrees of freedom. Chi-square: use for categorical data, either goodness-of-fit (does the data match an expected distribution?) or independence (are two categorical variables associated?).
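For the common case of a mean with unknown σ, the t statistic and its degrees of freedom can be computed as sketched below. The sample values and the hypothesized mean are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample: list[float], mu0: float) -> tuple[float, int]:
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    # stdev() is the sample standard deviation (divides by n - 1),
    # which is what the t-test uses when sigma is unknown.
    standard_error = stdev(sample) / sqrt(n)
    t_stat = (mean(sample) - mu0) / standard_error
    return t_stat, n - 1

# Hypothetical measurements tested against a hypothesized mean of 5.0.
measurements = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.3, 5.7]
t_stat, df = one_sample_t(measurements, mu0=5.0)
print(f"t = {t_stat:.3f}, df = {df}")
```

Turning t and df into a p-value requires the Student t distribution, which the Python standard library does not provide. A statistics library (for example SciPy's `scipy.stats.t.sf`) or a calculator that accepts t and df fills that gap.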

Question 5

What p-value is considered statistically significant?

Accepted Answer

Conventional thresholds: p < 0.05 → reject H₀ at the 5% level (standard in the social sciences). p < 0.01 → strong evidence against H₀. p < 0.001 → very strong evidence. Particle physics requires roughly p < 0.0000003 (5σ, one-sided) before claiming a discovery. Important: the reject/fail-to-reject decision is binary, but two tests with p = 0.049 and p = 0.051 provide essentially the same evidence, so treat the threshold as a guideline, not a bright line.
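The link between "5σ" and a p-value is the upper-tail area of a standard normal distribution. A minimal sketch of that conversion follows, assuming the one-sided convention; the function name is illustrative.

```python
from math import erfc, sqrt

def one_sided_p_from_sigma(n_sigma: float) -> float:
    """Upper-tail probability of a standard normal beyond n_sigma."""
    # erfc keeps precision in the far tail, where 1 - cdf would lose it.
    return 0.5 * erfc(n_sigma / sqrt(2))

for n_sigma in (2, 3, 5):
    p = one_sided_p_from_sigma(n_sigma)
    print(f"{n_sigma} sigma -> one-sided p = {p:.3g}")
```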

Question 6

Can I calculate chi-square p-values here?

Accepted Answer

Yes. Enter the χ² statistic and the degrees of freedom: df = (rows−1)×(cols−1) for contingency tables, or k−1 for goodness-of-fit with k categories. Chi-square p-values are always right-tailed (one-tailed only — you only reject for large χ² values). Example: 2×3 contingency table, df=2, χ²=8.5 → p≈0.014 → significant at α=0.05.
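For readers who want to check the χ² example by hand, here is a standard-library sketch of the right-tail probability. It is not the calculator's own code. It uses the recurrence for the regularized upper incomplete gamma function, starting from exp(−x/2) for even df or erfc(√(x/2)) for odd df, and is intended for small to moderate df only.

```python
from math import erfc, exp, gamma, sqrt

def contingency_df(rows: int, cols: int) -> int:
    """Degrees of freedom for a test of independence."""
    return (rows - 1) * (cols - 1)

def goodness_of_fit_df(categories: int) -> int:
    """Degrees of freedom for goodness-of-fit with k categories."""
    return categories - 1

def chi_square_p(chi2: float, df: int) -> float:
    """Right-tail p-value P(X >= chi2) for a chi-square with df degrees."""
    if chi2 < 0 or df < 1:
        raise ValueError("chi2 must be >= 0 and df must be >= 1")
    y = chi2 / 2
    # Base case: Q(1, y) = exp(-y) for even df, Q(1/2, y) = erfc(sqrt(y)) for odd df.
    if df % 2 == 0:
        a, q = 1.0, exp(-y)
    else:
        a, q = 0.5, erfc(sqrt(y))
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1).
    while a < df / 2:
        q += y ** a * exp(-y) / gamma(a + 1)
        a += 1
    return q

df = contingency_df(2, 3)
print(f"df = {df}, p = {chi_square_p(8.5, df):.3f}")
```

For large df the `gamma` call overflows, so a dedicated statistics library is the better choice there.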

P-value range	Evidence strength	Typical decision	Common in
p < 0.001	Very strong against H₀	Reject H₀	High-stakes clinical trials
0.001 ≤ p < 0.01	Strong against H₀	Reject H₀	Medical research, quality control
0.01 ≤ p < 0.05	Moderate against H₀	Reject H₀ (α=0.05)	Academic research, A/B testing
0.05 ≤ p < 0.10	Marginal — weak evidence	Fail to reject (standard)	Exploratory studies, pilot data
0.10 ≤ p < 0.50	Little evidence against H₀	Fail to reject H₀	Null results, replications
p ≥ 0.50	No evidence against H₀	Fail to reject H₀	No detectable difference (not proof of none)

Use case	Statistic	Key condition
Testing mean, population σ known	z	σ known (rare in practice)
Testing mean, σ unknown (typical)	t (with df = n−1)	Normal data or n ≥ 30
Testing two means (A/B test)	Welch t (df from Satterthwaite)	Independent samples
Goodness-of-fit (observed vs expected)	χ² (df = k−1)	Expected counts ≥ 5
Independence of two categorical vars	χ² (df = (r−1)(c−1))	Expected counts ≥ 5 per cell
Proportion test (large n)	z (normal approx)	np̂ ≥ 5 and n(1−p̂) ≥ 5
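The "Welch t (df from Satterthwaite)" row can be unpacked with a short sketch. The formulas are the standard Welch statistic and Welch-Satterthwaite degrees of freedom; the two samples are made-up values standing in for A/B test measurements.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a: list[float], sample_b: list[float]) -> tuple[float, float]:
    """Return (t statistic, Satterthwaite df) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    # Per-group squared standard errors; variances are NOT pooled.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df

# Hypothetical per-user metrics for variants A and B.
variant_a = [12.1, 11.8, 13.0, 12.4, 12.9, 11.5]
variant_b = [13.2, 13.9, 12.8, 14.1, 13.5, 13.0, 14.4]
t_stat, df = welch_t(variant_a, variant_b)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

Because the resulting df is fractional, the p-value lookup needs a t distribution that accepts non-integer degrees of freedom.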
