![]() |
Brighton Webs Ltd. statistical and data services for industry |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Home Index Feedback |
Median In a set of numbers, half of them are less than or equal to the median and half are greater than or equal to it. When the set contains an odd number of values, the median is the central value, for example:
When the set contains an even number, then the mean of the two central values is used, for example:
The attraction of the median as a value to summarise a set of values is that reflects peoples experience of the data, for example if the median value of the journey time to work is 20 minutes, then you know that 50% of journeys will be less than 20 minutes and the rest will be longer. The median is less sensitive to extreme values than the mean (average) because it is based on the number of values, not their magnitude. Calculation Unlike the mean (average), there is no formula for calculating the mean, it is derived from a procedure, the first step of which is sort the data set into ascending order, the second is to identify the mid point(s) and obtain the median values Data for Example Calculation The data for the sample calculation is the running time of 50 full-length English language drama films released to the UK market during 2006. The frequency diagram for the data set is shown below: The unsorted data is:
The minimum and maximum values are 85 and 149 respectively and the mean (average) is 107.5 Example calculation
Use of the median In collecting the data for this topic, it became clear that there were significant variations in film running times between cultures and genres, for example Hindi language romances could be more than an hour longer than English language dramas. For the purposes of this discussion, I've opted for a trivial use of the data. If you are going to the cinema, you can say that you have a 50/50 chance of being out of your seat in less than 106.5 minutes. Thus the median has some connection with personal experience. If after doing the analysis, you become aware of a long running film (the 51st value of 168 minutes), there is only a small change in the median value from 106.5 to 107 minutes. It is because of this link between experience and relative insensitivity to extreme values, that the median (and quartiles and deciles) is widely used for socio economic data. Spreadsheets Spreadsheets offer a relatively painless way of executing many statistical procedures, both Microsoft's Excel and Google Spreadsheet document have a median function,
The arguments can be either ranges or valid numbers, e.g.
Empty cells are ignored. Note for Programmers A lot of statistical literature is based on tables of data where the first entry referenced as 1 e.g. x(1) and the last entry n e.g. x(n) where n is also the number of items in the table. In programming languages (e.g. c++, vb etc.) tables are often implemented as arrays where the first entry is referenced as 0 e.g. x(0) and the last item is x(n-1). Thus some care is required to implement statistical methods in code. Page updated: 12-Nov-2007 |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
For more information: info@brighton-webs.co.uk |