Normal Distribution Explained

By CalcMulti Editorial Team · 8 min read

The normal distribution — also called the Gaussian distribution or bell curve — is the most important probability distribution in statistics. It describes an enormous range of natural and social phenomena and underlies nearly every parametric statistical test.

This guide explains what the normal distribution is, why it appears so often in the real world, how to use it for probability calculations, and — critically — when your data is NOT normally distributed and what to do instead.

Formula

f(x) = (1 / (σ√(2π))) × e^(−½((x−μ)/σ)²), where μ is the mean and σ is the standard deviation
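The density formula translates directly into code. A minimal pure-Python sketch (the function name `normal_pdf` is ours, not a standard-library name):

```python
import math

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of the normal distribution at x, per the formula above."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coeff * math.exp(exponent)

# Peak of the standard normal curve is 1/sqrt(2*pi) ≈ 0.3989
print(round(normal_pdf(0.0), 4))  # 0.3989
```

Since Python 3.8 the standard library also provides `statistics.NormalDist`, whose `.pdf()` method computes the same value.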

What Is the Normal Distribution?

A normal distribution is a symmetric, bell-shaped probability distribution completely described by two parameters: its mean μ (which controls where the centre is) and its standard deviation σ (which controls how wide or narrow the bell is). All normal distributions have the same symmetric shape — they just differ in location and spread.

The curve is defined so that the total area under it equals 1 (certainty). The probability of a value falling in a specific range equals the area under the curve between the endpoints of that range. Because the distribution is continuous, any single exact value has probability zero — you can only calculate the probability of falling within an interval. (The curve also extends infinitely in both directions, so no value is strictly impossible — values far from the mean are just vanishingly unlikely.)

Three key facts: (1) The mean, median, and mode are all equal and located at the centre of the distribution. (2) The distribution is perfectly symmetric around the mean — 50% of values are above the mean, 50% below. (3) About 99.7% of values lie within 3 standard deviations of the mean — the basis of the 3-sigma rule in quality control.

The 68–95–99.7 Rule (Empirical Rule)

The empirical rule is the most practical fact about normal distributions. It tells you what percentage of values fall within 1, 2, or 3 standard deviations of the mean without needing to compute integrals.

| Range | % of values within range | Example: heights (μ = 170 cm, σ = 10 cm) |
| --- | --- | --- |
| μ ± 1σ | 68.27% | 160 cm to 180 cm — about 68% of adults |
| μ ± 2σ | 95.45% | 150 cm to 190 cm — about 95% of adults |
| μ ± 3σ | 99.73% | 140 cm to 200 cm — nearly all adults |
| Beyond ±3σ | 0.27% (1 in 370) | Extremely unusual — signals a potential outlier or measurement error |
| Beyond ±4σ | 0.0063% (1 in 15,787) | Used in quality control: "4-sigma" events |
| Beyond ±6σ | 0.000000197% (1 in 506M) | Six Sigma quality target — near-zero defects |
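These percentages can be verified with Python's standard-library `statistics.NormalDist`, using the heights example from the table:

```python
from statistics import NormalDist

heights = NormalDist(mu=170, sigma=10)  # heights example: mean 170 cm, SD 10 cm

for k in (1, 2, 3):
    # P(mu - k*sigma < X < mu + k*sigma) = area under the curve in that range
    p = heights.cdf(170 + k * 10) - heights.cdf(170 - k * 10)
    print(f"within ±{k}σ: {p:.4%}")
# within ±1σ: 68.2689%
# within ±2σ: 95.4500%
# within ±3σ: 99.7300%
```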

The Standard Normal Distribution and Z-Scores

The standard normal distribution is a special normal distribution with μ = 0 and σ = 1. Any normal distribution can be converted to standard normal by calculating z-scores: z = (x − μ) / σ. This standardisation allows you to use a single table (or calculator) for all normal distributions.

A z-score tells you how many standard deviations a value is from the mean. z = 1.5 means the value is 1.5 SDs above the mean; z = −0.8 means 0.8 SDs below. The percentile corresponding to any z-score is given by the standard normal CDF (Φ).

Worked example: IQ scores are normally distributed with μ = 100, σ = 15. What percentile is an IQ of 130? z = (130 − 100) / 15 = 2.0. Φ(2.0) = 97.7th percentile. So an IQ of 130 is higher than 97.7% of the population.
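The worked IQ example in code — standardise to a z-score and evaluate Φ, or skip the standardisation and use the distribution's CDF directly:

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

z = (130 - 100) / 15              # z-score: 2.0 SDs above the mean
percentile = NormalDist().cdf(z)  # Φ(z) on the standard normal
print(z, round(percentile * 100, 1))  # 2.0 97.7

# Equivalently, without standardising first:
print(round(iq.cdf(130) * 100, 1))  # 97.7
```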

Why Do So Many Things Follow a Normal Distribution?

The Central Limit Theorem (CLT) is the reason. The CLT states: if you take many independent random measurements and add them together (or average them), the sum will be approximately normally distributed — regardless of what distribution each individual measurement follows. The approximation improves as sample size increases.

Most natural measurements are the sum of many small independent factors. Human height is influenced by hundreds of genes, nutrition factors, and environmental conditions — each adding a small amount. Their sum is normally distributed. Similarly: measurement errors (sum of many small equipment and human errors), exam scores (sum of many individual question results), and stock price returns over long periods all tend toward normality via the CLT.

This is why the normal distribution appears in: heights and weights; test scores and IQ; blood pressure measurements; manufacturing tolerances; astronomical measurement errors; financial asset returns (approximately, over short periods). The CLT also explains why sample means follow a normal distribution even when the underlying data does not — which is what allows t-tests and z-tests to work.
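The CLT is easy to see in a small simulation (a sketch with arbitrary parameters — 50 draws per sum, 20,000 sums): each observation is a sum of flat, uniform random draws, yet the sums land in a bell shape with roughly 68% inside one standard deviation.

```python
import random
from statistics import mean, stdev

random.seed(0)

# Each observation is the SUM of 50 independent uniform draws on [0, 1).
# Each draw is flat (not bell-shaped), but by the CLT the sums are
# approximately normal with mean 50 * 0.5 = 25 and SD sqrt(50/12) ≈ 2.04.
sums = [sum(random.random() for _ in range(50)) for _ in range(20_000)]

m, s = mean(sums), stdev(sums)
within_1sd = sum(1 for x in sums if abs(x - m) <= s) / len(sums)
print(round(m, 2), round(s, 2), round(within_1sd, 3))
# mean ≈ 25, SD ≈ 2.04, and ≈ 68% of sums within 1 SD — the empirical rule
```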

When Data Is NOT Normal — and What to Do

Many real-world datasets are not normally distributed. Assuming normality when it does not hold leads to incorrect confidence intervals, invalid p-values, and wrong decisions.

| Data type | Typical distribution | What to use instead |
| --- | --- | --- |
| Income, house prices, response times | Right-skewed (log-normal) | Log-transform data; report median; use non-parametric tests |
| Count data (arrivals, defects, rare events) | Poisson or negative binomial | Poisson regression; chi-square tests |
| Success/failure proportions | Binomial | Binomial test; logistic regression |
| Survival times, waiting times | Exponential or Weibull | Survival analysis; Cox proportional hazards |
| Percentages and proportions (0–1) | Beta distribution | Beta regression; logit transformation |
| Ordinal data (Likert scales) | No standard distribution | Median; Mann-Whitney U test; ordinal logistic regression |
| Very small samples (n < 30) | Unknown distribution | t-distribution; bootstrap methods; non-parametric tests |
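For right-skewed data, the first row's advice — report the median and log-transform — can be illustrated with simulated log-normal "income" data (all figures here are hypothetical, generated for the sketch):

```python
import math
import random
from statistics import mean, median

random.seed(1)

# Simulated right-skewed "income" data: log-normal, i.e. exp of a normal.
incomes = [math.exp(random.gauss(10, 0.8)) for _ in range(10_000)]

# The mean is dragged upward by the long right tail; the median is robust.
print(mean(incomes) > median(incomes))  # True for right-skewed data

# After a log-transform the data is approximately normal, so mean ≈ median.
logs = [math.log(x) for x in incomes]
print(abs(mean(logs) - median(logs)) < 0.1)  # True
```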

How to Check if Your Data Is Normal

Visual methods (always start here): Histogram — does it look roughly bell-shaped and symmetric? Q-Q plot (quantile–quantile plot) — if the data is normal, the points fall roughly on a straight diagonal line. Box plot — a symmetric box with the median in the middle and roughly equal whiskers suggests normality.

Numerical methods: Skewness — values close to 0 suggest symmetry (normal). Kurtosis — values near 3 suggest normal tail weight (note that many software packages report excess kurtosis, which subtracts 3, so normal data shows a value near 0). Shapiro-Wilk test — a formal statistical test for normality, most powerful for small samples (n < 50). Kolmogorov-Smirnov test — for larger samples. Lilliefors test — a variant of the KS test for when μ and σ are estimated from the data.
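Skewness and kurtosis are simple moment calculations. A pure-Python sketch (in practice you would use a statistics library such as SciPy's `scipy.stats.skew` and `scipy.stats.kurtosis`; these helper functions are ours):

```python
import math
import random
from statistics import mean

def skewness(xs):
    """Sample skewness: third standardised moment (≈ 0 for normal data)."""
    m, n = mean(xs), len(xs)
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return sum((x - m) ** 3 for x in xs) / (n * s ** 3)

def kurtosis(xs):
    """Sample kurtosis: fourth standardised moment (≈ 3 for normal data)."""
    m, n = mean(xs), len(xs)
    s2 = sum((x - m) ** 2 for x in xs) / n
    return sum((x - m) ** 4 for x in xs) / (n * s2 ** 2)

random.seed(2)
normal_data = [random.gauss(0, 1) for _ in range(10_000)]
skewed_data = [math.exp(x) for x in normal_data]  # log-normal: right-skewed

print(round(skewness(normal_data), 2), round(kurtosis(normal_data), 2))  # near 0 and 3
print(skewness(skewed_data) > 1)  # clearly right-skewed
```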

Rule of thumb: For large samples (n > 30), the Central Limit Theorem means many parametric tests are robust to non-normality. For small samples (n < 30) with skewed data, use non-parametric alternatives (Mann-Whitney U instead of t-test; Wilcoxon signed-rank instead of paired t-test; Kruskal-Wallis instead of ANOVA).

The most important advice: Do not rely solely on normality tests for the decision. With large samples, normality tests become oversensitive and reject normality even for trivially small deviations. With small samples, they lack power. Always combine visual inspection with numerical checks.

Educational use only. Content is based on publicly documented mathematical formulas and reviewed for accuracy by the CalcMulti Editorial Team. Last updated: February 2026.