P-Value Calculator

Reviewed by CalcMulti Editorial Team

The p-value is the probability of observing a test statistic at least as extreme as yours, assuming the null hypothesis is true. A small p-value provides evidence against H₀; a large one means the data are compatible with H₀, so you fail to reject it. This calculator computes exact p-values from z-scores, t-statistics, and chi-square statistics.

Choose your test statistic type, enter the value (plus degrees of freedom for t or χ²), and select one-tailed or two-tailed to get the exact p-value with a significance interpretation at common α levels.

Formula

Two-tailed p = 2 × P(Z > |z|)
Chi-square p = P(χ² > χ²_obs | df)

z: z-statistic, for large samples or known population σ
t: t-statistic, for small samples with unknown σ; requires df
χ²: chi-square statistic, for categorical data tests; requires df
p-value: probability of observing this result or one more extreme under H₀
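The two formulas above can be sketched with Python's standard library alone. This is a minimal illustration, not the calculator's actual implementation: the function names `z_p_value` and `chi_square_p_value` are hypothetical, the chi-square routine assumes an integer df of moderate size, and the one-tailed z result assumes the observed direction matches the alternative hypothesis.

```python
import math

def z_p_value(z, two_tailed=True):
    """P-value for a z-statistic under the standard normal distribution."""
    # Upper-tail probability P(Z > |z|) via the complementary error function.
    upper_tail = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return 2 * upper_tail if two_tailed else upper_tail

def chi_square_p_value(chi2_obs, df):
    """Upper-tail p-value P(chi2 > chi2_obs | df) for integer df >= 1."""
    if chi2_obs <= 0:
        return 1.0
    x = chi2_obs / 2.0
    # Starting value of the regularized upper incomplete gamma Q(a, x):
    # Q(1, x) = exp(-x) for even df, Q(1/2, x) = erfc(sqrt(x)) for odd df.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    # Recurrence Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1),
    # applied until a reaches df / 2.
    while a < df / 2.0:
        q += x ** a * math.exp(-x) / math.gamma(a + 1)
        a += 1.0
    return q

print(z_p_value(1.96))               # close to 0.05
print(chi_square_p_value(3.841, 1))  # close to 0.05
```

The t-distribution needs the regularized incomplete beta function, which the standard library does not provide, so it is left out of this sketch.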

P-Value Interpretation Guide

| P-value range | Evidence strength | Typical decision | Common in |
|---|---|---|---|
| p < 0.001 | Very strong against H₀ | Reject H₀ | High-stakes clinical trials |
| 0.001 ≤ p < 0.01 | Strong against H₀ | Reject H₀ | Medical research, quality control |
| 0.01 ≤ p < 0.05 | Moderate against H₀ | Reject H₀ (α = 0.05) | Academic research, A/B testing |
| 0.05 ≤ p < 0.10 | Marginal, weak evidence | Fail to reject H₀ (at α = 0.05) | Exploratory studies, pilot data |
| 0.10 ≤ p < 0.50 | Little evidence against H₀ | Fail to reject H₀ | Null results, replications |
| p ≥ 0.50 | No evidence against H₀ | Fail to reject H₀ | Null results (a large p-value does not prove there is no difference) |
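The guide above amounts to a lookup on the p-value plus a comparison against α. A rough sketch of that mapping follows; the function name `interpret_p_value` and the exact label strings are illustrative choices, not part of the calculator.

```python
def interpret_p_value(p, alpha=0.05):
    """Return (evidence label, decision) following the interpretation guide."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # Upper bounds of each band, checked in ascending order.
    bands = [
        (0.001, "Very strong against H0"),
        (0.01, "Strong against H0"),
        (0.05, "Moderate against H0"),
        (0.10, "Marginal, weak evidence"),
        (0.50, "Little evidence against H0"),
    ]
    evidence = "No evidence against H0"
    for upper, label in bands:
        if p < upper:
            evidence = label
            break
    # The decision depends only on the chosen significance level.
    decision = "Reject H0" if p < alpha else "Fail to reject H0"
    return evidence, decision

print(interpret_p_value(0.037))
print(interpret_p_value(0.037, alpha=0.01))
```

Keeping the evidence label separate from the decision makes it clear that the same p-value can lead to different decisions at different α levels.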

Which Test Statistic Should I Use?

| Use case | Statistic | Key condition |
|---|---|---|
| Testing mean, population σ known | z | σ known (rare in practice) |
| Testing mean, σ unknown (typical) | t (with df = n−1) | Normal data or n ≥ 30 |
| Testing two means (A/B test) | Welch t (df from Satterthwaite) | Independent samples |
| Goodness-of-fit (observed vs expected) | χ² (df = k−1) | Expected counts ≥ 5 |
| Independence of two categorical vars | χ² (df = (r−1)(c−1)) | Expected counts ≥ 5 per cell |
| Proportion test (large n) | z (normal approx) | np̂ ≥ 5 and n(1−p̂) ≥ 5 |
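The table mentions the Satterthwaite degrees of freedom for the Welch t-test without spelling it out. As a sketch, the standard Welch-Satterthwaite approximation can be computed from summary statistics as follows; the function name `welch_t_and_df` and the sample numbers are made up for illustration.

```python
import math

def welch_t_and_df(mean1, var1, n1, mean2, var2, n2):
    """Welch t-statistic and Satterthwaite df from sample means and variances."""
    # Squared standard error contributed by each group.
    se1_sq = var1 / n1
    se2_sq = var2 / n2
    t = (mean1 - mean2) / math.sqrt(se1_sq + se2_sq)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t, df

# Hypothetical summary data: equal variances and equal sample sizes.
t, df = welch_t_and_df(5.0, 4.0, 10, 3.0, 4.0, 10)
print(t, df)
```

With equal variances and equal sample sizes the approximation reduces to n₁ + n₂ − 2, which is a handy sanity check; in general the result is not an integer.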

Case Study: Email Subject Line A/B Test

A growth marketer ran an email A/B test: Subject A (control) sent to 1,200 recipients, 156 opens (13.0%). Subject B (treatment) sent to 1,200 recipients, 192 opens (16.0%). Is this difference statistically significant?

Two-proportion z-test: pooled p̂ = (156+192)/2400 = 0.145. SE = √(0.145 × 0.855 × (1/1200 + 1/1200)) = 0.01437. z = (0.160 − 0.130) / 0.01437 = 2.09. Two-tailed p-value ≈ 0.037.
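The calculation can be reproduced in a few lines of Python. This is a sketch under the same pooled-variance assumption as the text; the function name `two_proportion_z_test` is illustrative. At full precision the standard error comes out near 0.01437 and z near 2.09, and the two-tailed p-value still rounds to 0.037.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (pooled, se, z, two-tailed p)."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under H0: both groups share one open rate.
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-tailed p = 2 * P(Z > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled, se, z, p_value

# Subject A: 156 opens of 1,200; Subject B: 192 opens of 1,200.
pooled, se, z, p_value = two_proportion_z_test(156, 1200, 192, 1200)
print(f"pooled={pooled:.3f} se={se:.5f} z={z:.2f} p={p_value:.3f}")
```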

Because p = 0.037 < 0.05, the marketer rejected H₀ and concluded that Subject B significantly outperforms Subject A at the 5% level. She also computed the effect size: an absolute lift of +3.0 percentage points (from 13% to 16%). At the company's email volume of 50,000/month, this translates to ~1,500 additional opens per month, which is practically meaningful enough to justify the switch.
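The effect-size arithmetic is simple enough to check directly. In the sketch below the helper name `lift_summary` is hypothetical, and the relative lift is an extra figure derived from the same numbers rather than something reported in the case study.

```python
def lift_summary(x_control, n_control, x_treat, n_treat, monthly_volume):
    """Absolute lift, relative lift, and projected extra opens per month."""
    rate_control = x_control / n_control
    rate_treat = x_treat / n_treat
    absolute_lift = rate_treat - rate_control      # as a proportion
    relative_lift = absolute_lift / rate_control   # relative to control
    extra_opens = absolute_lift * monthly_volume
    return absolute_lift, relative_lift, extra_opens

abs_lift, rel_lift, extra = lift_summary(156, 1200, 192, 1200, 50_000)
print(f"absolute lift: {abs_lift * 100:.1f} percentage points")
print(f"relative lift: {rel_lift * 100:.1f}%")
print(f"extra opens per month: {extra:.0f}")
```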

Disclaimer

P-values measure evidence against H₀ under frequentist inference. Statistical significance at 0.05 does not imply practical significance. Always report effect sizes alongside p-values.

Frequently Asked Questions