Statistics Formulas — Complete Reference Guide
By CalcMulti Editorial Team · 10 min read
This reference guide collects the core formulas used in descriptive statistics, probability, and inferential statistics. Each formula is presented with its notation, what each symbol means, and a worked example. Bookmark this page as your go-to statistics formula sheet.
Formulas are organised by topic: start with descriptive statistics for summarising data, move to probability for modelling uncertainty, and finish with inferential statistics for drawing conclusions from samples.
Central Tendency
Central tendency measures describe where the centre of a dataset falls. The three main measures are the arithmetic mean, median, and mode — each appropriate for different data types and distributions.
| Measure | Formula | Use when | Sensitive to outliers? |
|---|---|---|---|
| Arithmetic Mean | x̄ = Σx / n | Symmetric data, no extreme outliers | Yes — strongly |
| Weighted Mean | x̄w = Σ(wᵢxᵢ) / Σwᵢ | Values have different importance/frequency | Yes |
| Geometric Mean | GM = (x₁×x₂×…×xₙ)^(1/n) | Multiplicative data: growth rates, ratios | Less than arithmetic mean |
| Harmonic Mean | HM = n / Σ(1/xᵢ) | Rates and speeds (distance/time) | Yes — to very small values |
| Median | Middle value when sorted | Skewed data, ordinal data, outliers present | No — robust |
| Mode | Most frequent value(s) | Categorical data, multimodal distributions | No |
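The central-tendency formulas above map directly onto Python's standard `statistics` module. A quick sketch using a made-up dataset:

```python
import statistics

# Hypothetical example dataset (not from the article).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)            # arithmetic mean: Σx / n
median = statistics.median(data)        # middle value when sorted
mode = statistics.mode(data)            # most frequent value
gm = statistics.geometric_mean(data)    # (x₁·x₂·…·xₙ)^(1/n)
hm = statistics.harmonic_mean(data)     # n / Σ(1/xᵢ)

# Weighted mean: Σ(wᵢxᵢ) / Σwᵢ, with hypothetical weights
weights = [1, 1, 1, 1, 2, 2, 3, 3]
wmean = sum(w * x for w, x in zip(weights, data)) / sum(weights)
```

Note that the harmonic mean is only defined for positive values, and the geometric mean requires non-negative data.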
Measures of Spread (Variability)
Spread measures quantify how dispersed values are around the centre. A low spread means values cluster tightly; a high spread means they are widely scattered. The choice of spread measure depends on data type and whether outliers are present.
| Measure | Formula | Units | Notes |
|---|---|---|---|
| Range | max − min | Same as data | Simple but heavily influenced by extremes |
| IQR | Q3 − Q1 | Same as data | Middle 50%; robust to outliers |
| Population Variance | σ² = Σ(xᵢ − μ)² / n | Squared units | Use when data IS the full population |
| Sample Variance | s² = Σ(xᵢ − x̄)² / (n−1) | Squared units | Use when data is a sample; Bessel's correction |
| Population SD | σ = √σ² | Same as data | Interpret directly in original units |
| Sample SD | s = √s² | Same as data | Most common reported measure of spread |
| Coefficient of Variation | CV = (s / x̄) × 100% | % | Relative spread — compare datasets with different units |
| Standard Error | SE = s / √n | Same as data | Precision of the sample mean estimate |
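The spread formulas can be checked the same way; `statistics.variance` applies Bessel's correction (n−1) automatically, while `statistics.pvariance` divides by n. Again the dataset is a made-up example:

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
n = len(data)
xbar = statistics.mean(data)

s2 = statistics.variance(data)      # sample variance: Σ(xᵢ−x̄)² / (n−1)
s = statistics.stdev(data)          # sample SD: √s²
sigma2 = statistics.pvariance(data) # population variance: Σ(xᵢ−μ)² / n

cv = s / xbar * 100                 # coefficient of variation, %
se = s / math.sqrt(n)               # standard error of the sample mean
```

For this dataset the squared deviations sum to 32, so the population variance is 32/8 = 4.0 while the sample variance is 32/7 ≈ 4.57, illustrating how Bessel's correction inflates the estimate slightly.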
Position and Standardisation
Position measures locate a specific value within a distribution. Z-scores standardise values to a common scale regardless of original units, enabling direct comparison across different datasets.
| Measure | Formula | Interpretation |
|---|---|---|
| Z-Score (population) | z = (x − μ) / σ | Standard deviations above/below the population mean |
| Z-Score (sample) | z = (x − x̄) / s | Standard deviations above/below the sample mean |
| Percentile rank | PR = (# values < x) / n × 100 | Percentage of data below value x |
| Quartile Q1 | Median of lower half | 25th percentile — 25% of data lies below |
| Quartile Q3 | Median of upper half | 75th percentile — 75% of data lies below |
| Tukey Outlier Fence | Q1 − 1.5×IQR and Q3 + 1.5×IQR | Values outside are potential outliers |
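A sketch of the position measures in Python. One caveat: quartile conventions vary, and `statistics.quantiles` (exclusive method by default) can give slightly different Q1/Q3 than the "median of each half" rule in the table above; both are legitimate:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
xbar, s = statistics.mean(data), statistics.stdev(data)

z = (9 - xbar) / s                # z-score of the value 9 (sample version)

q1, q2, q3 = statistics.quantiles(data, n=4)  # three cut points
iqr = q3 - q1

# Tukey fences: values outside are flagged as potential outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]
```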
Probability Rules
Probability quantifies uncertainty. All probabilities must satisfy 0 ≤ P(A) ≤ 1, and the probabilities of all possible outcomes must sum to 1. The four rules below cover the building blocks of most probability calculations.
| Rule | Formula | When to use |
|---|---|---|
| Complement rule | P(Aᶜ) = 1 − P(A) | Finding "at least one" or "not A" scenarios |
| Addition rule (mutually exclusive) | P(A ∪ B) = P(A) + P(B) | Events that cannot both occur simultaneously |
| Addition rule (general) | P(A ∪ B) = P(A) + P(B) − P(A ∩ B) | Events that can overlap |
| Multiplication rule (independent) | P(A ∩ B) = P(A) × P(B) | Events where one does not affect the other |
| Multiplication rule (dependent) | P(A ∩ B) = P(A) × P(B|A) | Events where one affects the probability of the other |
| Conditional probability | P(B|A) = P(A ∩ B) / P(A) | Probability of B given A has already occurred |
| Bayes' theorem | P(A|B) = P(B|A) × P(A) / P(B) | Updating probability when new evidence arrives |
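Bayes' theorem is easiest to see with numbers. A classic worked example (the figures below are hypothetical, not from the article): a disease affects 1% of a population, a test detects it 95% of the time, and it false-positives 5% of the time. The denominator P(positive) comes from the law of total probability:

```python
# Hypothetical inputs: prevalence, sensitivity, false-positive rate.
p_d = 0.01                # P(disease)
p_pos_given_d = 0.95      # P(positive | disease)
p_pos_given_not_d = 0.05  # P(positive | no disease)

# Law of total probability: P(positive)
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)

# Bayes' theorem: P(disease | positive)
p_d_given_pos = p_pos_given_d * p_d / p_pos
print(round(p_d_given_pos, 3))  # 0.161
```

Despite the accurate-sounding test, a positive result only implies about a 16% chance of disease, because the false positives from the large healthy group swamp the true positives from the small diseased group.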
Key Probability Distributions
Probability distributions describe how values are spread across possible outcomes. Choosing the right distribution is the foundation of correct statistical modelling.
| Distribution | Formula | Mean | Variance | Use for |
|---|---|---|---|---|
| Normal | f(x) = (1/(σ√(2π))) × e^(−½((x−μ)/σ)²) | μ | σ² | Continuous symmetric data; CLT approximation |
| Binomial | P(X=k) = C(n,k) × pᵏ × (1−p)^(n−k) | np | np(1−p) | Fixed n trials, success/failure outcomes |
| Poisson | P(X=k) = λᵏ × e^−λ / k! | λ | λ | Count of rare events in an interval |
| t-distribution | Heavy-tailed; df parameter | 0 (df>1) | df/(df−2) for df>2 | Small samples with unknown σ |
| Chi-square | Sum of squared standard normals | df | 2×df | Categorical data tests; variance tests |
| F-distribution | Ratio of two chi-square variates, each divided by its df | df₂/(df₂−2) for df₂>2 | — | ANOVA; comparing variances |
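The binomial and Poisson formulas in the table can be implemented directly with the standard library; the scenarios below are illustrative examples:

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) = C(n, k) · pᵏ · (1−p)^(n−k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) = λᵏ · e^(−λ) / k!."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# P(exactly 3 heads in 10 fair coin flips)
print(round(binomial_pmf(3, 10, 0.5), 4))  # 0.1172

# P(exactly 2 events in an interval where the rate is λ = 4)
print(round(poisson_pmf(2, 4), 4))  # 0.1465
```

The binomial mean np = 5 and the Poisson mean λ = 4 match the table's mean column, which is a useful sanity check on any implementation.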
Inferential Statistics Formulas
Inferential statistics uses sample data to draw conclusions about a population. The core tools are hypothesis tests (which produce a p-value) and confidence intervals (which produce a plausible range for the parameter).
| Test | Formula | df | Use for |
|---|---|---|---|
| One-sample z-test | z = (x̄ − μ₀) / (σ / √n) | — | Mean vs hypothesised μ₀, σ known |
| One-sample t-test | t = (x̄ − μ₀) / (s / √n) | n−1 | Mean vs hypothesised μ₀, σ unknown |
| Two-sample Welch t | t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂) | Welch–Satterthwaite approx. | Comparing two means, unequal variance |
| Chi-square GoF | χ² = Σ(O − E)² / E | k − 1 | Observed vs expected frequencies |
| Chi-square independence | χ² = Σ(O − E)² / E | (r−1)(c−1) | Association between two categorical variables |
| CI for mean (t) | x̄ ± t* × s/√n | n−1 | Unknown σ — uses t critical value |
| CI for proportion (z) | p̂ ± z* × √(p̂(1−p̂)/n) | — | Binary outcome — uses z critical value |
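A sketch of the one-sample t-test statistic and its matching confidence interval, using only the standard library. The sample and the null value μ₀ = 50 are hypothetical, and the 95% critical value t* ≈ 2.262 for df = 9 is taken from a t-table rather than computed:

```python
import math
import statistics

# Hypothetical sample; null hypothesis H0: μ = 50
sample = [48.2, 51.5, 49.8, 52.1, 47.9, 50.4, 51.0, 49.2, 50.8, 48.5]
mu0 = 50

n = len(sample)
xbar = statistics.mean(sample)
s = statistics.stdev(sample)

# One-sample t statistic with df = n − 1
t = (xbar - mu0) / (s / math.sqrt(n))

# 95% CI for the mean: x̄ ± t* · s/√n, t* ≈ 2.262 for df = 9 (from a t-table)
t_star = 2.262
half_width = t_star * s / math.sqrt(n)
ci = (xbar - half_width, xbar + half_width)
```

Because μ₀ = 50 falls inside the resulting interval, the test would fail to reject H0 at the 5% level — the two tools always agree in this way.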
Regression Formulas
Simple linear regression models the linear relationship between one predictor variable x and one outcome variable y. The goal is to find the line y = b₀ + b₁x that minimises the sum of squared residuals (the ordinary least squares criterion).
The slope b₁ tells you how much y changes on average for each unit increase in x. The intercept b₀ is the predicted value of y when x = 0. R² (the coefficient of determination) tells you what proportion of the variance in y is explained by x — ranging from 0 (no relationship) to 1 (perfect linear relationship).
| Quantity | Formula | Interpretation |
|---|---|---|
| Slope | b₁ = Σ(xᵢ−x̄)(yᵢ−ȳ) / Σ(xᵢ−x̄)² | Change in y per unit increase in x |
| Intercept | b₀ = ȳ − b₁x̄ | Predicted y when x = 0 |
| Pearson r | r = Σ(xᵢ−x̄)(yᵢ−ȳ) / ((n−1)sₓsᵧ) | Linear correlation strength: −1 to +1 |
| R-squared | R² = r² | Proportion of variance in y explained by x |
| Residual | e = yᵢ − ŷᵢ | Difference between observed and predicted y |
| RMSE | √(Σeᵢ² / (n−2)) | Typical prediction error in original units |
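The regression formulas above can be implemented in a few lines; the paired data here is a made-up example:

```python
import math
import statistics

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
xbar, ybar = statistics.mean(x), statistics.mean(y)

sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)

b1 = sxy / sxx          # slope: Σ(xᵢ−x̄)(yᵢ−ȳ) / Σ(xᵢ−x̄)²
b0 = ybar - b1 * xbar   # intercept: ȳ − b₁x̄

r = sxy / ((n - 1) * statistics.stdev(x) * statistics.stdev(y))
r_squared = r ** 2      # proportion of variance in y explained by x

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
rmse = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
```

For this data the fitted line is roughly ŷ = 0.05 + 1.99x, so each unit increase in x predicts about a 1.99-unit increase in y, which is how the slope row in the table should be read.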
Related Calculators
- Compute arithmetic, weighted, geometric mean
- Variance Calculator: population and sample variance
- Z-Score Calculator: standardise any value
- T-Test Calculator: two-sample hypothesis test
- Correlation Calculator: Pearson r between two variables
- Normal Distribution Calculator: probabilities under the bell curve
- Statistics Hub: all statistics calculators & guides
Educational use only. Content is based on publicly documented mathematical formulas and reviewed for accuracy by the CalcMulti Editorial Team. Last updated: February 2026.