Statistics Calculators
Free statistics tools with formula explanations, step-by-step examples, and real-world context. From mean and median to ANOVA and effect size — every calculator includes the math behind the result.
Maintained by CalcMulti Editorial Team · Last updated: February 2026
What Is Statistics?
Statistics is the branch of mathematics that deals with collecting, organising, analysing, interpreting, and presenting data. It underpins virtually every scientific field — from clinical trials and economic forecasting to machine learning and quality control. Statistics gives us the tools to extract meaningful conclusions from data that would otherwise be noise.
Descriptive Statistics
Summarises and describes the data you already have. It does not generalise beyond the sample.
- Mean, median, mode — central tendency
- Range, variance, standard deviation — spread
- Percentiles, quartiles — position
- Skewness, kurtosis — distribution shape
Inferential Statistics
Uses a sample to draw probability-based conclusions about a larger population.
- Confidence intervals — range for a parameter
- Hypothesis testing — p-values, t-tests, ANOVA
- Regression — relationships between variables
- Effect size — practical magnitude of a result
A critical distinction: when you have data for an entire population, you use population formulas (divide by n). When you have only a sample, you use sample formulas (divide by n − 1). This correction — known as Bessel's correction — makes the sample variance an unbiased estimator of the population variance.
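The distinction is easy to check with Python's standard `statistics` module, which exposes both conventions:

```python
import statistics

data = [4, 8, 6, 5, 3, 7]

# Population variance: divide by n (use when the data IS the whole population)
pop_var = statistics.pvariance(data)   # sum((x - mean)^2) / n  ->  ~2.917

# Sample variance: divide by n - 1 (Bessel's correction)
samp_var = statistics.variance(data)   # sum((x - mean)^2) / (n - 1)  ->  3.5
```

The sample variance is always the larger of the two, because dividing by the smaller n − 1 compensates for the sample mean sitting closer to the sample values than the true population mean would.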
Measures of Central Tendency
Mean · Median · Mode · Weighted Mean
Central tendency describes the centre of a dataset — a single representative value that summarises where most values cluster. There are three main measures, each suited to different data types and distributions. When observations carry different importance (different sample sizes, weights, or credits), the weighted mean is the correct measure.
Mean
Symbol: x̄ or μ · Formula: Σx / n
- Best for: Uses all data points; best for symmetric distributions
- Limitation: Sensitive to outliers
- Example use: Average exam score in a class

Median
Symbol: M · The middle value when sorted
- Best for: Robust to outliers; best for skewed data
- Limitation: Ignores magnitude of extreme values
- Example use: Median household income

Mode
Symbol: Mo · The most frequent value
- Best for: Works for categorical data
- Limitation: May not be unique; not useful for continuous data
- Example use: Most popular shoe size
When Mean Fails: Use the Weighted Mean
The arithmetic mean gives every observation equal weight. When data points represent different quantities — course credits for GPA, asset sizes for portfolio return, respondent counts for combined survey data — the weighted mean Σ(wᵢ × xᵢ) / Σwᵢ is the correct measure.
Example: a student earns 85% in a 3-credit course and 70% in a 1-credit course. Simple average = 77.5%. Weighted average = (85×3 + 70×1) / 4 = 81.25% — the correct GPA calculation.
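As a sketch, the weighted mean is a one-liner in Python (the helper name is ours):

```python
def weighted_mean(values, weights):
    """Sigma(w_i * x_i) / Sigma(w_i)"""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# The GPA example above: 85% in a 3-credit course, 70% in a 1-credit course
print(weighted_mean([85, 70], [3, 1]))  # 81.25
```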
Open Weighted Average Calculator →
When to Use Mean vs Median vs Mode
Decision guide — pick the right measure for your data
Choosing the wrong measure of central tendency produces misleading summaries. The decision depends on three factors: the level of measurement (nominal, ordinal, interval/ratio), the shape of the distribution (symmetric, skewed, bimodal), and whether outliers are present.
| Situation | Mean | Median | Mode |
|---|---|---|---|
| Symmetric distribution, no outliers | ✅ Best | ✅ OK | — |
| Skewed distribution (e.g. incomes) | ⚠️ Misleading | ✅ Best | — |
| Data has extreme outliers | ⚠️ Pulled by outliers | ✅ Best | — |
| Categorical data (colours, sizes) | ❌ Not valid | ❌ Not valid | ✅ Best |
| Finding most popular item | — | — | ✅ Best |
| Normal (bell curve) distribution | ✅ Best | ✅ Same | ✅ Same |
| Bimodal distribution (two peaks) | ⚠️ Misleading | ⚠️ Misleading | ✅ Both modes |
| Small dataset (< 10 values) | ✅ OK | ✅ OK | ⚠️ Unstable |
| Reporting to non-technical audience | ✅ Familiar | ✅ "Typical value" | ✅ "Most common" |
Quick Decision Rule: categorical data → mode; skewed data or outliers present → median; symmetric numeric data → mean.
Measures of Spread
Central tendency tells you where the data is centred. Spread (also called variability or dispersion) tells you how far values typically deviate from that centre. Two datasets can have identical means but completely different spreads — and that difference matters enormously in practice.
| Measure | Formula | Units | Use when | Calculator |
|---|---|---|---|---|
| Range | Max − Min | Same as data | Quick, rough estimate of spread | Calculate → |
| IQR (Q3 − Q1) | Q3 − Q1 | Same as data | Robust to outliers; used in box plots | Calculate → |
| Variance (σ²) | Σ(x − μ)² / n | Squared units | Mathematical derivations, ANOVA | Calculate → |
| Standard Deviation (σ) | √Variance | Same as data | Most practical reporting | Calculate → |
| Coefficient of Variation | (σ / μ) × 100 | % | Comparing spread across different scales | Calculate → |
Standard deviation is by far the most commonly reported measure of spread because it is in the same units as the data. Variance is useful internally but rarely reported to non-technical audiences. When comparing datasets measured in different units (e.g., height in cm vs weight in kg), use the coefficient of variation — it expresses spread as a percentage of the mean, making comparison valid. The IQR is the preferred spread measure for skewed data and is used in box plots and outlier detection.
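All five measures in the table can be computed with the Python standard library alone (the sample data is illustrative):

```python
import statistics

data = [12, 15, 14, 10, 18, 20, 11]

spread_range = max(data) - min(data)            # Range: max - min = 10
sd = statistics.stdev(data)                     # sample SD, divides by n - 1
var = statistics.variance(data)                 # sample variance = sd ** 2
cv = sd / statistics.mean(data) * 100           # coefficient of variation, %
q1, q2, q3 = statistics.quantiles(data, n=4)    # quartiles (exclusive method)
iqr = q3 - q1                                   # robust spread for box plots
```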
Position Measures — Where Does a Value Rank?
Position measures describe where a specific value sits within a distribution — relative to all other values. Unlike central tendency (where the data clusters) or spread (how wide the data is), position answers: how does this particular value compare to the rest?
Z-Score
z = (x − μ) / σ
Expresses how many standard deviations a value is from the mean. A z-score of +1.5 means the value is 1.5 standard deviations above average. Negative z-scores fall below the mean.
- Best for: Comparing values across different datasets (different units, different scales)
- Example: Comparing a student's performance in Math vs English on different scoring systems
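A sketch of the cross-subject comparison described above (all scores and scales here are hypothetical):

```python
def z_score(x, mean, sd):
    """Standard deviations above (+) or below (-) the mean."""
    return (x - mean) / sd

# Math exam: mean 70, SD 10. English exam: mean 50, SD 5.
math_z = z_score(85, 70, 10)     # 1.5 SDs above the Math average
english_z = z_score(62, 50, 5)   # 2.4 SDs above the English average

# The English result is relatively stronger despite the lower raw score.
```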
Percentile Rank
P = (B / n) × 100, where B is the number of values at or below the target value
Tells you what percentage of the dataset falls at or below a given value. The 75th percentile means 75% of values are at or below that point. Percentiles are used in standardised tests, growth charts, and salary benchmarks.
- Best for: Ranking a value within a real dataset (no assumption of normal distribution required)
- Example: Determining which percentile of test-takers a score falls in
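The percentile-rank formula translates directly to code; a minimal sketch with made-up scores:

```python
def percentile_rank(data, x):
    """Percentage of values at or below x: (B / n) * 100."""
    at_or_below = sum(1 for v in data if v <= x)
    return at_or_below / len(data) * 100

scores = [55, 61, 67, 70, 72, 75, 80, 84, 90, 95]
print(percentile_rank(scores, 80))  # 70.0 -- ties or beats 7 of the 10 scores
```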
Z-Score vs Percentile — Which to Use?
| Condition | Z-Score | Percentile Rank |
|---|---|---|
| Data is approximately normally distributed | ✅ Preferred | ✅ OK |
| Data is skewed or non-normal | ⚠️ Use with caution | ✅ Preferred |
| Comparing across two different datasets | ✅ Best (unit-free) | ⚠️ Only if same reference group |
| Communicating to a non-technical audience | ⚠️ Less intuitive | ✅ "You scored higher than X%" |
| Population σ is known | ✅ Use z = (x−μ)/σ | — |
| Working from raw data only | ⚠️ Need mean + σ first | ✅ Calculate directly from data |
Probability Distributions
Normal · Binomial · Poisson · Geometric
A probability distribution describes the probability of each possible outcome of a random variable. Choosing the correct distribution for your data is a foundational skill in statistics — the wrong distribution leads to incorrect p-values, wrong predictions, and flawed models.
| Distribution | Type | Key parameter(s) | Use when | Calculator |
|---|---|---|---|---|
| Normal | Continuous | μ, σ | Symmetric, bell-shaped data; z-tests; natural measurements | Calculate → |
| Binomial | Discrete | n trials, p success | Counting successes in a fixed number of independent binary trials | Calculate → |
| Poisson | Discrete | λ (rate) | Counting events in a fixed time or space interval (calls/hour, defects/batch) | Calculate → |
| Geometric | Discrete | p (success prob) | Number of trials until the first success (sales calls, retries) | Calculate → |
| T-distribution | Continuous | degrees of freedom | Small-sample inference when σ is unknown; t-tests | Calculate → |
Normal Distribution — Key Facts
- 68% of data falls within ±1σ of the mean
- 95% falls within ±2σ (empirical rule)
- 99.7% falls within ±3σ
- Mean = Median = Mode (perfectly symmetric)
- Foundation of z-scores and most parametric tests
Binomial → Normal Approximation
When n is large and p is not close to 0 or 1, the binomial distribution can be approximated by the normal distribution. Rule of thumb: use the approximation when np ≥ 5 and n(1−p) ≥ 5.
Mean = np, Standard deviation = √(np(1−p)). Apply the continuity correction (+0.5 or −0.5) for improved accuracy.
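A quick numerical check of the approximation, using only the Python standard library (n = 100, p = 0.4, and k = 45 are illustrative; np = 40 and n(1 − p) = 60 both clear the rule of thumb):

```python
import math

def binom_cdf(k, n, p):
    """Exact P(X <= k) for a Binomial(n, p) variable."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mu, sigma):
    """Normal CDF via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

n, p, k = 100, 0.4, 45
mu, sigma = n * p, math.sqrt(n * p * (1 - p))   # mean 40, SD ~4.90

exact = binom_cdf(k, n, p)
approx = normal_cdf(k + 0.5, mu, sigma)         # continuity correction: k + 0.5
# exact and approx agree to about two decimal places
```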
Binomial vs Normal Comparison →
Data Shape — Skewness, Kurtosis & Outliers
Beyond centre and spread — the shape of your distribution matters
After computing mean and standard deviation, the next step is understanding the shape of your distribution. Shape affects which statistical tests are valid, whether parametric or non-parametric methods are appropriate, and how you should communicate your results. Three key shape measures are skewness, kurtosis, and the five-number summary.
Skewness
Measures asymmetry. Right-skewed (positive) distributions have a long tail to the right — mean > median. Left-skewed (negative) distributions have a long tail to the left — mean < median. Values near 0 indicate symmetry.
• |skewness| < 0.5 → approximately symmetric
• 0.5–1.0 → moderately skewed
• > 1.0 → highly skewed (consider log transform)
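A minimal moment-based skewness sketch (the population formula; library implementations such as `scipy.stats.skew` add bias-corrected variants):

```python
import statistics

def skewness(data):
    """Mean of cubed z-scores (population moment formula)."""
    m = statistics.mean(data)
    s = statistics.pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]   # one value drags out the right tail
print(skewness(right_skewed))              # > 1, i.e. highly skewed
# As expected for right skew, mean (4.75) > median (3.0)
```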
Kurtosis
Measures tail heaviness relative to a normal distribution. Excess kurtosis = 0 means normal (mesokurtic). Positive excess kurtosis means heavy tails and a sharp peak (leptokurtic) — extreme values occur more often than expected.
• Excess kurtosis = 0 → Normal (mesokurtic)
• > 0 → Heavy tails (leptokurtic, e.g. finance)
• < 0 → Light tails (platykurtic, e.g. uniform)
Five-Number Summary
Min, Q1, Median, Q3, Max form the five-number summary — the foundation of box plots. It describes shape without assuming a distribution: the spread of the middle 50% (IQR), and whether extreme values (outliers) are present.
• IQR = Q3 − Q1
• Mild outlier: < Q1 − 1.5×IQR or > Q3 + 1.5×IQR
• Extreme outlier: < Q1 − 3×IQR or > Q3 + 3×IQR
Outlier Detection — Two Methods Compared
| Property | IQR Method (Tukey Fences) | Z-Score Method (\|z\| > 3) |
|---|---|---|
| Requires normality? | No — robust to any distribution | Best for approximately normal data |
| Affected by outliers? | No — Q1/Q3 are resistant | Yes — outliers inflate mean and SD |
| Best for small n? | ✅ Reliable for n < 30 | ⚠️ Unreliable — use IQR instead |
| Detects mild outliers? | ✅ 1.5×IQR fence | ⚠️ Only extreme \|z\| > 3 cases |
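The IQR method from the table reduces to a few lines (Python's `statistics.quantiles` uses the exclusive quartile method by default, so fence values can differ slightly from other tools):

```python
import statistics

def tukey_outliers(data, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 mild, k=3 extreme."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

data = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(tukey_outliers(data))   # [102]
```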
Probability & Inferential Statistics
Inferential statistics bridges the gap between a sample and the larger population it represents. The foundation is probability theory — which quantifies uncertainty mathematically.
Z-Score & Normal Distribution
The z-score converts any value to the number of standard deviations from its distribution's mean. This allows comparison across different datasets. Under a normal distribution, approximately 68% of values fall within ±1σ, 95% within ±2σ, and 99.7% within ±3σ (the empirical rule).
Z-Score Calculator →
Confidence Intervals
A 95% confidence interval means: if you repeated the sampling process 100 times, approximately 95 of the resulting intervals would contain the true population parameter. It quantifies the precision of an estimate — a wide interval means high uncertainty; a narrow interval means the sample provides strong evidence.
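For large samples, a z-based interval is a reasonable sketch (small samples should use the t-distribution instead; the data here is illustrative):

```python
import math
import statistics

def ci_mean(data, z=1.96):
    """z-based 95% CI for the mean: mean +/- z * SE."""
    mean = statistics.mean(data)
    se = statistics.stdev(data) / math.sqrt(len(data))   # standard error
    return mean - z * se, mean + z * se

data = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2, 5.1, 4.7, 5.4, 5.0]
low, high = ci_mean(data)
print(round(low, 3), round(high, 3))
```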
Confidence Interval Calculator →
P-Values & Hypothesis Testing
A p-value is the probability of observing your result (or more extreme) if the null hypothesis were true. A small p-value (< 0.05 by convention) is evidence against the null hypothesis. Important: statistical significance ≠ practical importance. Always pair p-values with effect sizes.
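For a z statistic, the two-tailed p-value is one line with `statistics.NormalDist` (Python 3.8+):

```python
from statistics import NormalDist

def two_tailed_p(z):
    """P(|Z| >= |z|) under the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(round(two_tailed_p(1.96), 3))   # 0.05 -- the conventional threshold
```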
P-Value Calculator →
Conditional Probability & Bayes
Conditional probability asks: given that event A occurred, what is the probability of B? P(B|A) = P(A ∩ B) / P(A). Bayes' theorem reverses this: it lets you update a prior belief with new evidence. It is the foundation of Bayesian statistics and is used in spam filters, medical diagnosis, and machine learning.
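The classic diagnostic-test example, with hypothetical numbers, shows why updating the prior matters in practice:

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' theorem."""
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# Hypothetical: 1% prevalence, 99% sensitivity, 5% false-positive rate
print(round(posterior(0.01, 0.99, 0.05), 3))   # 0.167 -- most positives are false alarms
```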
Probability Calculator →
Advanced Analysis — ANOVA, Mann-Whitney & Effect Size
Comparing multiple groups and measuring practical significance
When you move beyond two groups or need to quantify how meaningful a result is in practice, three tools become essential: ANOVA for comparing three or more group means, the Mann-Whitney U test for non-parametric group comparisons, and effect size measures for reporting practical significance alongside p-values.
One-Way ANOVA
Compares means across 3+ groups simultaneously using the F-statistic: ratio of between-group variance to within-group variance. A significant F (p < 0.05) means at least one group mean is different — but not which one.
F = MS_between / MS_within
Post-hoc: Tukey HSD, Bonferroni correction to identify which groups differ.
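The F-statistic follows directly from the two mean squares; a minimal sketch with illustrative groups (a p-value would additionally need the F-distribution):

```python
import statistics

def one_way_f(*groups):
    """F = MS_between / MS_within for k groups."""
    pooled = [x for g in groups for x in g]
    grand_mean = statistics.mean(pooled)
    k, n = len(groups), len(pooled)

    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)

    ms_between = ss_between / (k - 1)   # df_between = k - 1
    ms_within = ss_within / (n - k)     # df_within = n - k
    return ms_between / ms_within

f = one_way_f([4, 5, 6], [7, 8, 9], [1, 2, 3])
print(f)   # 27.0 -- group means differ far more than within-group noise
```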
ANOVA Calculator →
Mann-Whitney U Test
Non-parametric alternative to the independent t-test. Use when data is ordinal, clearly non-normal, or sample sizes are small (n < 30). Compares distributions via rank sums — no normality assumption required.
Effect size: rank-biserial correlation r = 1 − 2U/(n₁n₂). Interpret as: 0.1 small, 0.3 medium, 0.5 large.
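U can be computed by direct pair counting; a sketch with made-up groups (by convention the smaller of the two U values is reported):

```python
def u_statistic(a, b):
    """Pairs (x, y) with x from a beating y from b; ties count 0.5."""
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)

a = [12, 15, 18, 20]
b = [8, 9, 11, 13]
u_a = u_statistic(a, b)               # 15 of the 16 pairs favour group a
u = min(u_a, len(a) * len(b) - u_a)   # report the smaller U
r = 1 - 2 * u / (len(a) * len(b))     # rank-biserial r = 0.875 -> large effect
```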
Mann-Whitney U Calculator →
Effect Size
Measures the practical magnitude of a result, independent of sample size. A p-value alone tells you if a result is likely real — effect size tells you if it matters.
• Cohen's d: 0.2 small, 0.5 medium, 0.8 large
• r: 0.1 small, 0.3 medium, 0.5 large
• η² (ANOVA): 0.01 small, 0.06 medium, 0.14 large
Which Test to Use — Decision Guide
| Situation | Recommended Test |
|---|---|
| Comparing 2 groups, continuous data, approximately normal | Independent t-test |
| Comparing 2 groups, ordinal data or non-normal | Mann-Whitney U test |
| Comparing 3+ groups, continuous data, approximately normal | One-way ANOVA |
| Comparing 3+ groups, ordinal or non-normal | Kruskal-Wallis test |
| One sample vs known value, σ known | Z-test |
| One sample vs known value, σ unknown | One-sample t-test |
| Categorical outcomes (frequencies) | Chi-square test |
| Relationship between two continuous variables | Pearson correlation / linear regression |
All Statistics Calculators
- Central Tendency: Arithmetic, weighted, geometric, and harmonic mean.
- Central Tendency: Middle value of any dataset — robust to outliers.
- Central Tendency: Most frequent value; bimodal and multimodal support.
- Central Tendency: Weighted mean for GPA, portfolio returns, and survey data.
- Spread: Population and sample standard deviation with variance.
- Spread: Population and sample variance from raw data.
- Spread: Max − min — the simplest measure of spread.
- Spread: Relative variability as a % of the mean.
- Spread: Q3 − Q1 for box plots and outlier detection.
- Descriptive: Min, Q1, Median, Q3, Max — complete box plot data.
- Descriptive: Absolute, relative, and cumulative frequency tables.
- Descriptive: Distribution shape — symmetric, skewed, heavy-tailed.
- Descriptive: IQR method (Tukey fences) + z-score outlier detection.
- Position: Standardise any value in standard deviation units.
- Position: Percentile rank of a value within a dataset.
- Position: T-score from raw data with p-value for small samples.
- Distribution: P(X < x), P(X > x), P(a < X < b) for any mean and σ.
- Distribution: P(X = k) and P(X ≤ k) for fixed-trial binary outcomes.
- Distribution: Probability of k events in a fixed interval (rate λ).
- Distribution: Trials until first success — P(X = k), mean, variance.
- Inference: Minimum sample size for surveys and experiments.
- Inference: One-sample and two-sample t-test with p-value.
- Inference: One-way ANOVA F-test for 3+ group comparisons.
- Inference: Non-parametric alternative to the independent t-test.
- Inference: Goodness-of-fit χ² statistic with p-value.
- Inference: SE of the mean with 95% and 99% confidence intervals.
- Inference: 95% and 99% CIs for means and proportions.
- Inference: One-tail and two-tail p-values from z and t scores.
- Relationships: Pearson r and R² for two paired variables.
- Relationships: Slope, intercept, R² and predictions for y = mx + b.
- Advanced: Cohen's d, r from t-test, and eta-squared (η²).
- Probability: Basic, conditional, and complement probability.

Formula Guides
Deep-dive explanations — where each formula comes from, how to apply it, and common mistakes to avoid.
Complete formula reference: mean, variance, z-score, t-test, ANOVA, regression, and more.
When to Use Mean, Median, or Mode — Decision guide for choosing the right central tendency measure with real examples.
Descriptive vs Inferential Statistics — Core distinction, with examples of when each type applies.
Normal Distribution Explained — Bell curve, empirical rule (68-95-99.7), standardisation, and applications.
Hypothesis Testing Basics — H₀, H₁, significance level, p-value, Type I and Type II errors explained.
Probability Rules Explained — Addition rule, multiplication rule, conditional probability, and Bayes' theorem.
Mean Formula Explained — How arithmetic, weighted, geometric, and harmonic mean are derived.
Population vs Sample Variance — Why we divide by n for a population but n−1 for a sample (Bessel's correction).
Z-Score Formula Guide — Standardisation, the normal distribution, and when to use z vs t.
Comparisons
Side-by-side analysis — when to choose one method over another, with worked examples.
When each measure is the right choice — with skewed data examples.
Variance vs Standard Deviation — Same information, different units; which to report and why.
Mean vs Weighted Mean — When sample sizes differ, the weighted mean is the correct measure.
Z-Score vs Percentile — Two ways to express position: normal-based vs rank-based.
Binomial vs Normal Distribution — Discrete vs continuous, and when the normal approximation is valid.
Sample vs Population — Bessel's correction, degrees of freedom, and why n−1 matters.
T-Test vs Z-Test — Small samples with unknown σ need t; large samples or known σ use z.
Common Statistical Errors
Confusing correlation with causation
Two variables moving together (correlation) does not mean one causes the other. Ice cream sales and drowning rates both rise in summer — but ice cream does not cause drowning; both are driven by hot weather. Always look for confounding variables before inferring causation.
Using mean for skewed data
The arithmetic mean is pulled toward outliers. When a dataset is right-skewed — such as income distributions, housing prices, or response times — report the median. A small number of extremely high values inflate the mean, making it unrepresentative of the typical case.
Misinterpreting p < 0.05 as proof
Statistical significance means the result is unlikely under the null hypothesis — it does not confirm that the alternative hypothesis is true. A p-value of 0.04 means there is a 4% chance of observing a result at least this extreme if the null hypothesis were true. Multiple comparisons compound this problem: run 20 tests and expect one to be "significant" by chance at p < 0.05.
Reporting p-values without effect sizes
A p-value only tells you whether a result is statistically detectable — not whether it matters. With a large enough sample, even a trivial 0.1-point difference in means will be statistically significant. Always report effect size (Cohen's d, r, η²) alongside p-values to communicate practical importance.
Dividing by n instead of n−1 for sample variance
When computing variance from a sample (not the full population), the denominator must be n−1, not n. This is Bessel's correction — it produces an unbiased estimate of the population variance. Most calculators and software default to the correct formula, but be aware of which convention a tool uses.
Averaging percentages directly
You cannot take the arithmetic mean of percentages unless the sample sizes are equal. If store A has 20% return rate on 1,000 sales and store B has 80% return rate on 10 sales, the combined rate is not 50% — it is (200 + 8) / 1,010 ≈ 20.6%. Use weighted mean with sample sizes as weights.
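The store example reduces to a weighted mean with sample sizes as weights:

```python
def combined_rate(rates_pct, counts):
    """Weighted mean of percentage rates, weighted by sample size."""
    events = sum(r / 100 * n for r, n in zip(rates_pct, counts))
    return events / sum(counts) * 100

# 20% return rate on 1,000 sales, 80% return rate on 10 sales
print(round(combined_rate([20, 80], [1000, 10]), 1))   # 20.6 -- not 50
```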
Educational use only. All calculators on this page use standard mathematical formulas from academic and public domain sources. Content is reviewed for accuracy by the CalcMulti Editorial Team. For research, clinical, or professional decisions, verify results with qualified software and subject-matter expertise. Last updated: February 2026.