Calculating the Mean
When most people say "average," they mean the arithmetic mean. Add up all the values and divide by how many there are. That's it. You can think of it as the balancing point: if every value in the dataset were adjusted to equal the mean, the total wouldn't change.
Dataset: 14, 19, 22, 22, 28, 31, 35
Sum = 14 + 19 + 22 + 22 + 28 + 31 + 35 = 171
Count = 7
Mean = 171 ÷ 7 = 24.43 (to 2 d.p.)
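If you want to check that arithmetic in code, here is a minimal Python sketch using the same dataset. The variable names are just illustrative; `statistics.mean` is from Python's standard library.

```python
from statistics import mean

data = [14, 19, 22, 22, 28, 31, 35]

total = sum(data)              # 171
count = len(data)              # 7
manual_mean = total / count    # add everything up, divide by how many there are

# The standard library computes the same thing in one call.
library_mean = mean(data)

print(round(manual_mean, 2))   # 24.43

# The "balancing point" idea: deviations above and below the mean cancel out
# (up to floating-point rounding).
deviation_sum = sum(x - manual_mean for x in data)
print(abs(deviation_sum) < 1e-9)   # True
```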
The mean uses every value, which sounds great, but it's also why it can mislead you. One extreme value, high or low, can pull the mean far away from where most of the data actually is. If one person scores 100 and everyone else scores 50, the mean lands somewhere above 50, a score that nobody in the group actually got.
For a weighted mean, multiply each value by its weight, sum the products, then divide by the total of the weights. Teachers use it all the time when different assignments count for different portions of your final grade.
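A rough sketch of that recipe in Python follows. The function name `weighted_mean` and the grade figures are made up for illustration; they are not from a real grading scheme.

```python
def weighted_mean(values, weights):
    """Multiply each value by its weight, sum the products, divide by total weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must be the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical grades: homework 80 (weight 20), midterm 70 (weight 30), final 90 (weight 50)
scores = [80, 70, 90]
weights = [20, 30, 50]

print(weighted_mean(scores, weights))   # 82.0
```

Notice that the plain mean of 80, 70 and 90 would be 80. The weighted mean comes out higher because the strongest score carries the biggest weight.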
Finding the Median
The median is the middle value of a sorted dataset. Sort everything from smallest to largest, find the middle, and that's it. Half the values sit below it, half above. And here's the key thing: an extreme value can get as extreme as it likes without moving the median.
Odd Number of Values
(Already sorted - 7 values)
Middle position = (7 + 1) ÷ 2 = position 4
Median = 18
Even Number of Values
When the count is even, there's no exact middle, so you take the two central values and average them.
(6 values - two middle values are at positions 3 and 4)
Middle values: 13 and 17
Median = (13 + 17) ÷ 2 = 15
Always sort the data first. Seriously, always. Finding the median from an unsorted list is one of the most common mistakes, and the position only makes sense when the numbers are in order.
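Here is one way to sketch both cases in Python, sorting first every time. The helper name `median_by_position` is an assumption for illustration. The odd-length list is the dataset from the mean section, and the even-length list is six of the weekly sales figures from the outlier section, both deliberately shuffled.

```python
from statistics import median

def median_by_position(values):
    """Sort, then pick the middle value (odd count) or average the two middle values (even count)."""
    ordered = sorted(values)          # always sort first
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # 0-based index mid is 1-based position (n + 1) / 2
        return ordered[mid]
    # even count: average the two central values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [28, 14, 35, 22, 19, 31, 22]    # 7 values, unsorted on purpose
even_data = [52, 41, 55, 47, 44, 49]       # 6 values, unsorted on purpose

print(median_by_position(odd_data))    # 22
print(median_by_position(even_data))   # 48.0

# The standard library sorts internally and agrees.
print(median(odd_data), median(even_data))   # 22 48.0
```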
Identifying the Mode
The mode is simply the value that shows up most often. There's no formula. You count how many times each value appears and report the one with the highest count.
Frequency: 4→1, 7→2, 9→1, 11→1, 13→3, 15→1
Mode = 13 (appears 3 times)
A dataset can have more than one mode. If two values are tied for the top frequency, it's bimodal. Three or more and it's multimodal. If everything appears exactly once, there's no mode.
Dataset: 3, 5, 5, 8, 9, 9, 12
Mode = 5 and 9 (both appear twice)
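The counting approach can be sketched in Python like this. The function name `modes` is hypothetical, and it follows this article's convention that a dataset where everything appears once has no mode. The first list is rebuilt from the frequency table above.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest count, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []                     # everything appears exactly once: no mode
    return [value for value, count in counts.items() if count == top]

single = [4, 7, 7, 9, 11, 13, 13, 13, 15]   # from the frequency table: 13 appears 3 times
bimodal = [3, 5, 5, 8, 9, 9, 12]

print(modes(single))      # [13]
print(modes(bimodal))     # [5, 9]
print(modes([1, 2, 3]))   # []
```

Python's standard library also has `statistics.multimode`, but it returns every value when nothing repeats, so it doesn't match the "no mode" convention used here.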
The mode is the only average that works with categorical data, meaning values that fall into named groups rather than numbers. The most popular item on a menu, the most common shoe size sold, the most frequent answer in a survey: those are all modal values. You can't really take a mean of shoe sizes and have it mean anything useful.
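As a quick illustration with non-numeric data, here is a small Python sketch. The survey answers are invented for the example.

```python
from collections import Counter

# Hypothetical survey responses, made up for illustration
answers = ["blue", "green", "blue", "red", "blue", "green"]

counts = Counter(answers)
favourite, frequency = counts.most_common(1)[0]

print(favourite, frequency)   # blue 3
```

There is nothing to add up or sort here, only labels to count, which is why the mode is the only option.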
Choosing the Right Average
Which average you use depends on your data and what you're actually trying to find out.
| Situation | Best Measure | Reason |
|---|---|---|
| Daily temperatures over a month | Mean | Symmetrically distributed, no extreme outliers |
| House prices in a neighbourhood | Median | A few luxury properties distort the mean |
| Most common dress size sold | Mode | Categorical; mean of sizes is not meaningful |
| Exam marks across a class | Mean | Useful when every mark contributes equally |
| Hospital waiting times | Median | A small number of very long waits skew the mean |
| Survey: favourite colour | Mode | Non-numerical - only frequency applies |
| Wages in a company | Median | Executive salaries create extreme outliers |
So basically: use the mean when the data is balanced and there are no wild outliers, the median when the data is skewed or an extreme value is dragging the mean in one direction, and the mode when you want the most common value or when your data isn't even numeric.
The Effect of Outliers on Mean vs. Median
An outlier is a value way outside the normal range of the dataset. The mean is very sensitive to outliers. The median barely cares. And that difference matters a lot in real life.
Take this example: weekly sales figures for a team of seven sales reps.
Mean = (41 + 44 + 47 + 49 + 52 + 55 + 198) ÷ 7
= 486 ÷ 7 = 69.4 units
Median = middle value (4th of 7) = 49 units
That 198 figure, probably a one-off bulk order, drags the mean all the way up to 69.4. But six out of seven reps sold between 41 and 55 units. The mean of 69.4 doesn't represent what a normal week looks like for any of them. The median of 49 is honest. The mean isn't.
Now take that outlier out:
Mean = 288 ÷ 6 = 48 units
Median = (47 + 49) ÷ 2 = 48 units
Without the outlier, the mean and median are basically the same, which tells you the remaining data is pretty symmetrical. When mean and median are close, the mean is trustworthy. When they're far apart, go with the median and start looking for what's causing the gap.
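The whole comparison can be reproduced with a short Python sketch using the sales figures above. The variable names are illustrative only.

```python
from statistics import mean, median

sales = [41, 44, 47, 49, 52, 55, 198]
without_outlier = [x for x in sales if x != 198]

# With the outlier: the mean is dragged up, the median stays put.
print(f"{mean(sales):.1f}", median(sales))   # 69.4 49

# Without it: the mean and median agree.
print(f"{mean(without_outlier):.1f}", f"{median(without_outlier):.1f}")   # 48.0 48.0

# The gap between the two is a quick warning sign that something is skewing the data.
gap_with = mean(sales) - median(sales)
gap_without = mean(without_outlier) - median(without_outlier)
print(f"{gap_with:.1f}", f"{gap_without:.1f}")   # 20.4 0.0
```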