Population vs Sample Variance — Why n vs n−1?
By CalcMulti Editorial Team · 9 min read
Every statistics student eventually asks: why does sample variance divide by n−1 instead of n? The answer involves one of the most elegant ideas in statistics — the concept of bias in estimation. Getting this wrong produces variance estimates that are systematically too small, which in turn makes standard deviations too small, confidence intervals too narrow, and hypothesis tests too eager to declare significance.
This guide explains the two formulas, where the difference comes from, why it matters, and the mathematical intuition behind Bessel's correction — without requiring calculus.
Formula
Population: σ² = Σ(xᵢ − μ)² / n | Sample: s² = Σ(xᵢ − x̄)² / (n − 1)
The Two Formulas Side by Side
Both formulas measure the average squared deviation from the mean. The only difference is the denominator: n for population variance, n−1 for sample variance.
Population variance (σ²) divides by n. It is used when your dataset IS the entire population — every single member of the group you are studying. Example: you have the exact score of every student in one specific class of 30. You want to describe the spread of that class, not estimate the spread of a larger group. Divide by n = 30.
Sample variance (s²) divides by n−1. It is used when your dataset is a SAMPLE drawn from a larger population, and you want to estimate the true population variance. Example: you surveyed 30 randomly selected students from a school of 1,200 and want to estimate the variance for the whole school. Divide by n−1 = 29.
Worked example with the same data, different intent: Dataset {4, 7, 13, 2}. n = 4. Mean = (4+7+13+2)/4 = 6.5. Deviations: −2.5, +0.5, +6.5, −4.5. Squared deviations: 6.25, 0.25, 42.25, 20.25. Sum = 69.0. Population variance: 69.0/4 = 17.25. Sample variance: 69.0/3 = 23.0.
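The arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch using only built-ins; the variable names are our own:

```python
# Worked example: same data, two denominators.
data = [4, 7, 13, 2]
n = len(data)
mean = sum(data) / n                              # (4+7+13+2)/4 = 6.5
sq_dev_sum = sum((x - mean) ** 2 for x in data)   # 6.25 + 0.25 + 42.25 + 20.25 = 69.0

pop_var = sq_dev_sum / n         # population variance: divide by n
samp_var = sq_dev_sum / (n - 1)  # sample variance: divide by n - 1

print(pop_var, samp_var)  # 17.25 23.0
```

Note that the sum of squared deviations is identical in both cases; only the final division differs.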
| Property | Population Variance (σ²) | Sample Variance (s²) |
|---|---|---|
| Symbol | σ² (sigma squared) | s² |
| Denominator | n | n − 1 |
| When to use | Data IS the full population | Data is a sample from a larger population |
| Result | Exact population spread | Unbiased estimate of population spread |
| Bias | Exact (not an estimate, so bias does not apply) | Unbiased estimator of σ² |
| Example | All 30 students in one class | 30 students sampled from 1,200 |
The Bias Problem — Why Dividing by n Is Wrong for Samples
When you compute variance from a sample, you use the sample mean x̄ — not the true population mean μ. This creates a subtle but systematic problem: x̄ is the value that minimises the sum of squared deviations from your specific sample. No other value would produce a smaller sum of squared deviations for your particular data.
Because x̄ is optimised for your sample, the deviations (xᵢ − x̄) are systematically smaller than the deviations from the true population mean (xᵢ − μ). When you sum the squared deviations and divide by n, you are dividing a sum that is already biased downward. The result — the biased variance — consistently underestimates the true population variance σ².
A simple demonstration: suppose the population is {1, 5, 9} with μ = 5 and σ² = (16+0+16)/3 = 10.67. Now take the sample {1, 5}. Sample mean x̄ = 3. Biased estimate (÷n): [(1−3)²+(5−3)²]/2 = [4+4]/2 = 4. Unbiased estimate (÷n−1): [4+4]/1 = 8. True population variance is 10.67. Neither sample estimate is exactly right (that is sampling variation), but averaged over all possible samples of size 2 (drawn with replacement, so the observations are independent), dividing by n−1 hits the true value exactly; dividing by n is consistently too low.
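The "on average" claim can be checked exhaustively rather than by simulation. The sketch below (our own illustration, sampling with replacement so draws are independent) enumerates all 9 possible samples of size 2 from {1, 5, 9} and averages both estimators:

```python
from itertools import product
from statistics import fmean

population = [1, 5, 9]
mu = fmean(population)                               # 5.0
sigma2 = fmean([(x - mu) ** 2 for x in population])  # 32/3 ≈ 10.667

biased, unbiased = [], []
for sample in product(population, repeat=2):  # all 9 samples of size 2, with replacement
    xbar = fmean(sample)
    ss = sum((x - xbar) ** 2 for x in sample)
    biased.append(ss / 2)    # divide by n
    unbiased.append(ss / 1)  # divide by n - 1

print(fmean(unbiased))  # ≈ 10.667: matches sigma2 exactly on average
print(fmean(biased))    # ≈ 5.333: equals sigma2 * (n-1)/n, systematically too low
```

With n = 2 the biased estimator averages to exactly half the true variance, matching the (n−1)/n = 1/2 factor derived below.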
The mathematical proof shows that E[biased estimator] = σ² × (n−1)/n. Multiplying by n/(n−1) corrects this: E[s²] = σ². This is exactly what dividing by n−1 achieves.
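For readers who want to see where that (n−1)/n factor comes from, the key steps need only algebra and the fact that the variance of a mean of n independent observations is σ²/n:

```latex
\begin{aligned}
\sum_{i=1}^{n}(x_i-\bar{x})^2 &= \sum_{i=1}^{n}(x_i-\mu)^2 \;-\; n(\bar{x}-\mu)^2 \\
\mathbb{E}\!\left[\sum_{i=1}^{n}(x_i-\mu)^2\right] &= n\sigma^2,
\qquad
\mathbb{E}\!\left[n(\bar{x}-\mu)^2\right] = n\cdot\frac{\sigma^2}{n} = \sigma^2 \\
\Rightarrow\quad
\mathbb{E}\!\left[\sum_{i=1}^{n}(x_i-\bar{x})^2\right] &= n\sigma^2-\sigma^2=(n-1)\sigma^2
\end{aligned}
```

Dividing the left-hand side by n gives expected value σ²(n−1)/n (the biased estimator); dividing by n−1 gives exactly σ².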
Bessel's Correction — The Intuition
Friedrich Bessel (1784–1846) was a German mathematician and astronomer who formalised this correction while working on astronomical measurement errors. The correction that bears his name answers: by what factor should we scale up the biased variance to remove the downward bias?
The answer is n/(n−1). Multiplying the biased variance (÷n) by n/(n−1) gives the unbiased variance (÷n−1). These two operations are equivalent: σ²_biased × n/(n−1) = Σ(xᵢ−x̄)²/n × n/(n−1) = Σ(xᵢ−x̄)²/(n−1) = s².
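That chain of equalities is easy to verify numerically. A quick sketch using the worked dataset from earlier (any dataset would do):

```python
data = [4, 7, 13, 2]
n = len(data)
xbar = sum(data) / n
biased = sum((x - xbar) ** 2 for x in data) / n        # divide by n -> 17.25
corrected = biased * n / (n - 1)                       # apply Bessel's correction
direct = sum((x - xbar) ** 2 for x in data) / (n - 1)  # divide by n - 1 directly

print(corrected, direct)  # 23.0 23.0 -- identical
```

Scaling the biased variance by n/(n−1) and dividing the raw sum by n−1 are the same operation.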
The intuition: when you draw a sample, you have n data points but only n−1 of them are 'free' to vary. Once you know n−1 values and the sample mean, the last value is completely determined (it must make the mean come out right). This is what statisticians call degrees of freedom. You have n observations but spend 1 degree of freedom computing x̄, leaving n−1 free degrees. Dividing by n−1 accounts for this constraint.
The effect of the correction diminishes as n grows: for n = 5, dividing by 4 vs 5 is a 25% difference. For n = 100, it is only 1%. For n = 1,000, it is 0.1%. This is why the distinction matters most in small samples (n < 30) and is negligible for large datasets.
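A short sketch makes the shrinking correction factor n/(n−1) concrete:

```python
# How much Bessel's correction inflates the biased variance, by sample size.
for n in (2, 5, 10, 30, 100, 1000):
    factor = n / (n - 1)
    print(f"n={n:>5}: factor {factor:.4f} ({(factor - 1) * 100:.2f}% inflation)")
```

At n = 5 the correction inflates the estimate by 25%; by n = 1,000 the two formulas are practically indistinguishable.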
Practical Decision Guide — Which Formula to Use
Use population variance (÷n) when: you measured every single member of the group you care about; you have no interest in generalising to a wider group; you are describing a closed system (all employees in a specific company, all products in a finished batch, all students in a single class during a specific term).
Use sample variance (÷n−1) when: you collected data from a subset of a larger group; you want to make inferences or predictions about the wider population; your data comes from a survey, experiment, clinical trial, or any process where you could in principle collect more data. This is the correct choice in the vast majority of real-world statistical work.
When you are not sure: default to sample variance (n−1). Most statistical software uses the sample formula by default: Excel's legacy VAR() and modern VAR.S() both divide by n−1 (VAR.P() is the population version), and R's var() divides by n−1. Python's NumPy is the notable exception: numpy.var() defaults to the population formula (ddof=0), so pass ddof=1 to get the sample formula. Python's standard-library statistics module offers both, as variance() for samples and pvariance() for populations.
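A minimal sketch of the Python behaviour, using only the standard library so it runs without NumPy installed:

```python
from statistics import pvariance, variance

data = [4, 7, 13, 2]

print(pvariance(data))  # 17.25 -- population formula, divides by n
print(variance(data))   # 23.0  -- sample formula, divides by n - 1

# With NumPy (not run here): np.var(data) gives 17.25 (ddof=0 default),
# while np.var(data, ddof=1) gives 23.0.
```

If you port code between NumPy and other tools, check the default: silently mixing ddof=0 and ddof=1 results is a common source of small discrepancies.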
It rarely matters for large samples. For n ≥ 100, the difference between n and n−1 in the denominator is less than 1%. The distinction is most consequential when n is small (2–30), which is common in pilot studies, quality control sampling, and experimental research.
Related Calculators
Variance Calculator: Compute σ² and s² with full deviation table
Standard Deviation Calculator: Square root of variance, same units as data
Variance vs Standard Deviation: Which to report and why
Mean Calculator: Required first step before variance
Coefficient of Variation Calculator: Relative spread as % of mean
Statistics Hub: All statistics calculators & guides
Educational use only. Content is based on publicly documented mathematical formulas and reviewed for accuracy by the CalcMulti Editorial Team. Last updated: February 2026.