Mean vs Median — Which Should You Use?

By CalcMulti Editorial Team · 7 min read

Both the mean and the median measure the centre of a dataset — but they define "centre" differently, and for skewed or outlier-heavy data they can give wildly different answers. Using the wrong one produces a summary statistic that is technically correct but deeply misleading.

The classic example: the mean US household income is pulled far above the median by a small number of extremely high earners. A politician citing the mean makes households sound more prosperous; one citing the median describes the typical household more accurately. Both numbers come from the same data, but the choice of measure changes the story.

Side-by-Side Comparison

Property | Mean (x̄) | Median
Definition | Sum of all values ÷ count | Middle value in sorted data
Formula | x̄ = Σx / n | x[(n+1)/2] for odd n; average of two middle values for even n
Sensitive to outliers? | Yes: pulled by extreme values | No: only position matters
Data type required | Numerical (interval or ratio scale) | Numerical or ordinal
For symmetric distributions | Mean = Median | Mean = Median
For right-skewed data | Mean > Median | Better represents typical value
For left-skewed data | Mean < Median | Better represents typical value
Mathematical tractability | High: used in SD, regression, ANOVA | Lower: harder to work with algebraically
Uniqueness | Always unique | Unique (interpolated for even n)
Effect of adding a value | Changes mean | May or may not change median
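The two definitions in the table can be written out directly. The following is a minimal sketch, not a reference implementation: the function names and the small sample lists are illustrative assumptions, and Python's standard statistics module already provides equivalent mean and median functions.

```python
def mean(values):
    """Arithmetic mean: sum of all values divided by the count."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data; average of the two middle values for even n."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 0-based index n // 2 is the 1-based position (n + 1) / 2.
        return ordered[mid]
    # Even n: average the two values that straddle the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Illustrative data only (not from the article).
odd_sample = [3, 1, 4, 1, 5]
even_sample = [3, 1, 4, 1]
print(mean(odd_sample), median(odd_sample))
print(mean(even_sample), median(even_sample))
```

Note that the median needs a sort while the mean needs only a running sum, which is one reason the mean is easier to work with algebraically.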

When the Mean Is the Right Choice

The arithmetic mean is preferred when data is approximately symmetric with no extreme outliers. In this situation, the mean and median are nearly equal, and the mean has practical advantages: it uses every data point, is mathematically tractable (you can derive variance, standard deviation, confidence intervals, and regression coefficients from it), and is the foundation of most inferential statistics.

Specific contexts where the mean excels: exam scores in a normal class distribution; heights and weights in a homogeneous population; physical measurements (lengths, temperatures, reaction times) from a controlled lab setting; financial returns when modelling expected value; and any situation where you will perform further statistical analysis (t-tests, ANOVA, and regression are all built on means).

Worked example with exam scores {65, 70, 72, 74, 76, 78, 80}: Mean = 515/7 ≈ 73.6. Median = 74 (4th value when sorted). The two measures are nearly identical, which is consistent with roughly symmetric data. Either would be a fair representation; the mean is slightly preferred because it can feed directly into standard deviation and confidence interval calculations.

Rule of thumb: if the mean and median are within 5–10% of each other, the data is roughly symmetric and the mean is appropriate. If they diverge significantly, investigate the distribution shape before deciding.
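The rule of thumb can be turned into a quick check. This sketch makes two assumptions the article does not specify: the gap is measured relative to the median, and the upper end of the 5–10% range (10%) is used as the cut-off. The function names relative_gap and suggest_measure are hypothetical.

```python
from statistics import mean, median


def relative_gap(values):
    """Absolute mean-median gap as a fraction of the median (assumed baseline)."""
    avg, mid = mean(values), median(values)
    if mid == 0:
        raise ValueError("relative gap is undefined when the median is 0")
    return abs(avg - mid) / abs(mid)


def suggest_measure(values, threshold=0.10):
    """Suggest 'mean' when the gap is within the threshold, otherwise 'median'."""
    return "mean" if relative_gap(values) <= threshold else "median"


exam_scores = [65, 70, 72, 74, 76, 78, 80]
print(f"gap = {relative_gap(exam_scores):.1%}, suggestion: {suggest_measure(exam_scores)}")
```

A result of "median" is a prompt to look at the distribution shape, not a final verdict.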

When the Median Is the Right Choice

The median becomes superior whenever the data is skewed or contains outliers. Because the median depends only on rank (not magnitude), extreme values cannot distort it. This makes the median a robust statistic — statisticians describe robustness as resistance to violations of ideal conditions.

Income and wealth data is the canonical example. US median household income (2024) is approximately $80,000. Mean household income is over $115,000, pulled up by billionaires and high earners. If you want to know what a typical family earns, the median gives the honest answer. The mean tells you what each household would receive if the entire national income were divided equally, a hypothetical that does not reflect lived reality.

House prices, rent, and property values follow the same pattern. A neighbourhood with 9 houses worth $400,000 each and one mansion worth $5 million has a mean property value of $860,000 — higher than 9 out of 10 houses. The median of $400,000 reflects what a typical house in that neighbourhood is actually worth.
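The neighbourhood example can be reproduced in a few lines. This is a small sketch using Python's standard statistics module; the variable names are illustrative.

```python
from statistics import mean, median

# Nine houses at $400,000 and one mansion at $5 million, as in the example above.
prices = [400_000] * 9 + [5_000_000]

avg_price = mean(prices)
typical_price = median(prices)
# Count how many houses are worth less than the mean.
below_mean = sum(price < avg_price for price in prices)

print(f"mean   = ${avg_price:,.0f}")
print(f"median = ${typical_price:,.0f}")
print(f"{below_mean} of {len(prices)} houses are worth less than the mean")
```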

Response times in web performance and medical testing are typically right-skewed: most responses cluster in a reasonable range, but a small number take extremely long. The 95th percentile and median are far more useful than the mean for setting performance targets, because a few extreme outliers can make the mean look poor even when 90% of users have a fast experience.
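To make the response-time point concrete, here is a sketch with made-up latencies: 18 fast requests and 2 very slow ones. The data is hypothetical, and the 95th percentile uses the default interpolation of statistics.quantiles, which is only one of several common percentile conventions.

```python
from statistics import mean, median, quantiles

# Hypothetical response times in milliseconds (illustrative only).
response_ms = [120, 130, 110, 140, 150, 125, 135, 145, 115, 155,
               160, 128, 132, 138, 142, 148, 152, 118, 4000, 9000]

avg = mean(response_ms)
p50 = median(response_ms)
# quantiles(..., n=100) returns the 99 cut points between percentiles;
# index 94 is the 95th percentile.
p95 = quantiles(response_ms, n=100)[94]

# Share of requests that were faster than the mean.
faster_than_mean = sum(t < avg for t in response_ms) / len(response_ms)

print(f"mean = {avg:.0f} ms, median = {p50:.0f} ms, p95 = {p95:.0f} ms")
print(f"{faster_than_mean:.0%} of requests were faster than the mean")
```

In this sample the mean describes almost no actual request, while the median shows the common case and the 95th percentile shows how bad the tail is.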

Using Skewness to Make the Decision

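The relationship between mean and median reveals the shape of a distribution. This is sometimes called Pearson's rule: for right-skewed (positively skewed) distributions, mean > median > mode. For symmetric distributions, mean ≈ median ≈ mode. For left-skewed (negatively skewed) distributions, mean < median < mode. Treat it as a rule of thumb rather than a theorem: it usually holds for unimodal distributions, but exceptions exist, particularly in discrete or multimodal data.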
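The ordering can be checked on a small sample. In this sketch the two datasets are invented for illustration (the second is a mirror image of the first), and the function name ordering is hypothetical.

```python
from statistics import mean, median, mode


def ordering(values):
    """Report which mean/median/mode ordering a sample shows."""
    avg, mid, most_common = mean(values), median(values), mode(values)
    if avg > mid > most_common:
        return "mean > median > mode (right-skew pattern)"
    if avg < mid < most_common:
        return "mean < median < mode (left-skew pattern)"
    return "no strict ordering (roughly symmetric or irregular)"


# Illustrative data: a long right tail, and its mirror image with a long left tail.
right_tail = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
left_tail = [15 - x for x in right_tail]

print(ordering(right_tail))
print(ordering(left_tail))
```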

Right-skewed data (long right tail): income, wealth, sales figures, website traffic, insurance claims, city populations. In all these cases the median is more representative of "typical." The mean is still valid and important — particularly for calculating totals (total income = mean × n) — but should not be used alone to describe the "typical" case.

Left-skewed data (long left tail): age at death in a developed country (most people live to old age; few die very young) and test scores after an easy exam (most scores cluster near the maximum). Left skew is less common than right skew in practical data but follows the same rule: use median for the "typical" value.

When both are needed: the best practice in professional reporting is to provide both. "Mean income: $115,000 (Median: $80,000)" immediately signals skewness to the reader and is more transparent than either alone. Academic papers often report "mean ± standard deviation (median [IQR])" for this reason.
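A reporting line in the "mean ± standard deviation (median [IQR])" style could be produced as follows. This is a sketch under stated assumptions: the function name summarize and the income sample are hypothetical, the standard deviation is the sample (n − 1) version, and the quartiles use the default method of statistics.quantiles, so other software may give slightly different IQR bounds.

```python
from statistics import mean, median, quantiles, stdev


def summarize(values):
    """Format a sample as 'mean ± SD (median [IQR q1 to q3])'."""
    q1, _, q3 = quantiles(values, n=4)  # three cut points: Q1, Q2, Q3
    return (f"mean {mean(values):.1f} ± {stdev(values):.1f} "
            f"(median {median(values):.1f} [IQR {q1:.1f} to {q3:.1f}])")


# Hypothetical household incomes in thousands of dollars (illustrative only).
incomes_k = [32, 41, 48, 55, 63, 72, 80, 95, 120, 650]
print(summarize(incomes_k))
```

A large gap between the reported mean and median, together with a standard deviation wider than the IQR, signals the skew at a glance.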

How a Single Outlier Changes Each Measure

A concrete demonstration of outlier sensitivity. Dataset: {10, 12, 14, 15, 16, 18, 20}. Mean = 105/7 = 15.0. Median = 15 (4th value). Mean and median agree — the data is symmetric.

Now add one outlier: {10, 12, 14, 15, 16, 18, 20, 200}. New mean = 305/8 = 38.1 — more than doubled, and now higher than 7 of the 8 values. New median = (15+16)/2 = 15.5 — barely changed. The outlier pulled the mean 154% above its original value while barely affecting the median.
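The same before-and-after comparison can be reproduced in code. This sketch uses the dataset from the demonstration above; the helper pct_change is a hypothetical name.

```python
from statistics import mean, median

base = [10, 12, 14, 15, 16, 18, 20]
with_outlier = base + [200]


def pct_change(before, after):
    """Percentage change from 'before' to 'after'."""
    return (after - before) / before * 100


mean_shift = pct_change(mean(base), mean(with_outlier))
median_shift = pct_change(median(base), median(with_outlier))

print(f"mean:   {mean(base):.1f} -> {mean(with_outlier):.1f} ({mean_shift:+.0f}%)")
print(f"median: {median(base):.1f} -> {median(with_outlier):.1f} ({median_shift:+.1f}%)")
```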

This illustrates the breakdown point: the proportion of the data that must be replaced by arbitrarily extreme values before a statistic can be pushed arbitrarily far from the truth. The mean has a breakdown point of 0% (in the limit of large samples): even one extreme outlier can ruin it. The median has a breakdown point of 50%: as long as fewer than half of the values are corrupted, the median stays within the range of the uncorrupted data.
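The breakdown point can be observed by corrupting a growing number of values and checking whether each statistic still lies within the range of the clean data. This is an illustrative sketch: the nine-value sample, the corrupt helper, and the choice of 1e9 as the "arbitrarily extreme" value are all assumptions.

```python
from statistics import mean, median


def corrupt(values, k, bad=1e9):
    """Return a copy of values with the first k entries replaced by an extreme value."""
    return [bad] * k + list(values[k:])


clean = [10, 12, 14, 15, 16, 18, 20, 22, 24]
upper = max(clean)

for k in range(len(clean) + 1):
    data = corrupt(clean, k)
    mean_ok = mean(data) <= upper      # still inside the clean range?
    median_ok = median(data) <= upper
    print(f"{k} of {len(clean)} corrupted: mean ok = {mean_ok}, median ok = {median_ok}")
```

With nine values, the mean leaves the clean range after a single corruption, while the median holds until five values (more than half) are corrupted.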

Practical implication: if your dataset comes from a process that could produce errors, extreme values, or heavy-tailed distributions (anything involving money, time, or human behaviour), check both mean and median and report both if they differ substantially.

Summary

Use the median when your data is skewed or contains outliers — it gives the honest "typical" value. Use the mean when data is symmetric, you need mathematical tractability, or you will perform further statistical analysis. When in doubt, report both.

  • Mean for: exam scores, lab measurements, symmetric data, any dataset feeding into t-tests, ANOVA, or regression
  • Median for: income, wealth, house prices, response times, survival times, any right-skewed or outlier-prone data
  • Both when: you are reporting to a general audience, or mean and median differ by more than 10%
  • Quick test: if mean > median by a meaningful margin, data is right-skewed — prefer median for "typical" description

Educational use only. Content is based on publicly documented mathematical formulas and reviewed for accuracy by the CalcMulti Editorial Team. Last updated: February 2026.